Abstract. We describe an algorithm for deciding the first-order multisorted theory BAPA,
which combines 1) Boolean algebras of sets of uninterpreted elements (BA) and 2) Pres- 
burger arithmetic operations (PA). BAPA can express the relationship between integer vari- 
ables and cardinalities of sets, and supports arbitrary quantification over both sets and inte- 
gers. 

Our motivation for BAPA is deciding verification conditions that arise in the static analysis 
of data structure consistency properties. Data structures often use an integer variable to keep 
track of the number of elements they store; an invariant of such a data structure is that the 
value of the integer variable is equal to the number of elements stored in the data structure. 
When the data structure content is represented by a set, the resulting constraints can be cap- 
tured in BAPA. BAPA formulas with quantifier alternations arise when annotations contain 
quantifiers themselves, or when proving simulation relation conditions for refinement and 
equivalence of program fragments. Furthermore, BAPA constraints can be used to extend 
the techniques for proving the termination of integer programs to programs that manipulate 
data structures, and have applications in constraint databases. 

We give a formal description of a decision procedure for BAPA, which implies the decid- 
ability of the satisfiability and validity problems for BAPA. We analyze our algorithm and 
obtain an elementary upper bound on the running time, thereby giving the first complexity 
bound for BAPA. Because it works by a reduction to PA, our algorithm yields the decidabil- 
ity of a combination of sets of uninterpreted elements with any decidable extension of PA. 
Our algorithm can also be used to yield a space-optimal decision procedure for BA through
a reduction to PA with bounded quantifiers.

We have implemented our algorithm and used it to discharge verification conditions in the 
Jahob system for data structure consistency checking of Java programs; our experience with 
the algorithm is promising. 

1 Introduction 

Program analysis and verification tools can greatly contribute to software reliability, es- 
pecially when used throughout the software development process. Such tools are even 
more valuable if their behavior is predictable, if they can be applied to partial programs, 
and if they allow the developer to communicate the design information in the form of 
specifications. Combining the basic idea of [22,28] with decidable logics leads to anal- 
ysis tools that have these desirable properties. Such analyses are precise (because for- 
mulas represent loop-free code precisely) and predictable (because the checking of ver- 
ification conditions terminates either with a realizable counterexample or with a sound 
claim that there are no counterexamples). 

A key challenge in this approach to program analysis and verification is to identify 
a logic that captures an interesting class of program properties, but is nevertheless
decidable. In [41-43, 80] we identify the first-order theory of Boolean algebras (BA) as a
useful language for reasoning about dynamically allocated objects: BA allows expressing
generalized typestate properties and reasoning about data structures as dynamically
changing sets of objects. BA is known to be decidable [45,67]. 

The motivation for this paper is the fact that we often need to reason not only about 
the data structure content, but also about the size of the data structure. For example, we 
may want to express the fact that the number of elements stored in a data structure is 
equal to the value of an integer variable that is used to cache the data structure size, or 
we may want to introduce a decreasing integer measure on the data structure to show 
program termination. These considerations lead to a natural generalization of the first- 
order theory of BA of sets, a generalization that allows integer variables in addition 
to set variables, and allows stating relations of the form $|A| = k$ meaning that the
cardinality of the set A is equal to the value of the integer variable k. Once we have 
integer variables, a natural question arises: which relations and operations on integers 
should we allow? It turns out that, using only the BA operations and the cardinality 
operator, we can already define all operations of PA. This leads to the structure BAPA, 
which properly generalizes both BA and PA. 

As we explain in Section 2, a version of BAPA was shown decidable already in [19] 
(which also proves the well-known Feferman-Vaught theorem [29, Section 9.6] about 
the products of first-order theories). Recently, a decision procedure for a fragment of 
BAPA without quantification over sets was presented in [79], cast as a multi-sorted the- 
ory. Starting from [43] as our motivation, we have observed in [38] the decidability of 
the full BAPA (which was initially left open in [79]). After our report [38], an algorithm 
for a language between BA and BAPA was presented in [62] as a way of evaluating 
queries in constraint databases. The constraints in [62] allow only constant integer pa- 
rameters and not integer variables; moreover, [62] still leaves open the complexity of 
the algorithm. 

Our paper gives the first formal description of a decision procedure for the full 
first-order theory of BAPA. Furthermore, we analyze our decision procedure and show 
that it yields an elementary upper bound on the complexity of BAPA. Our result is 
the first upper complexity bound on BAPA; along with a lower bound from PA, we 
obtain a good estimate of BAPA worst-case complexity. We have also implemented our 
decision procedure; we report on our initial experience in using the decision procedure 
in the context of a system for checking data structure consistency. 
Contributions. We summarize the contributions of our paper as follows. 

1. As a motivation for BAPA, we show in Section 3 how BAPA constraints can be 
used for program analysis and verification by expressing 1) data structure invari- 
ants, 2) the correctness of procedures with respect to their specifications, 3) simu- 
lation relations between program fragments, and 4) termination conditions for pro- 
grams that manipulate data structures. 

2. We present an algorithm a (Section 4) that translates BAPA sentences into PA 
sentences by translating set quantifiers into integer quantifiers. The algorithm is 
surprisingly simple (the entire source code is included in the Appendix, Section 12) 
and shows a deep connection between BA and PA. 

3. We analyze our algorithm a and show that it yields an elementary upper bound on 
the worst-case complexity of the validity problem for BAPA sentences that is close
to the bound on PA sentences themselves (Section 5). This is the first complexity
bound for BAPA, and is the main contribution of this paper. 

4. We discuss our experience in using our implementation of BAPA to discharge 
verification conditions generated in the Jahob verification system [34]. 

5. In addition, we note the following related complexity, decidability and undecidabil- 
ity results: 

(a) We show that PA sentences generated by translating pure BA sentences can be 
checked for validity in singly exponential space, which is a good bound in the 
light of the alternating exponential lower bound for BA (Section 5.2).

(b) We show how to extend our algorithm to infinite sets and predicates for distin- 
guishing finite and infinite sets (Section 10). 

(c) We examine the relationship of our results to the monadic second-order logic 
(MSOL) of strings (Section 11). In contrast to the undecidability of MSOL with
equicardinality operator (Section 11.2), we identify a combination of MSOL
over trees with BA that is decidable. This result follows from the fact that our 
algorithm a enables adding BA operations to any extension of PA, including 
decidable extensions such as MSOL over strings (Section 11.1). 

A preliminary version of our results, including the algorithm and complexity analysis,
appears in [38], which also contains some background on quantifier elimination.

2 The First-Order Theory BAPA 

Figure 3 presents the syntax of Boolean Algebra with Presburger Arithmetic (BAPA), 
which is the focus of this paper. We next present some justification for the operations in 
Figure 3. Our initial motivation for BAPA was the use of BA to reason about data
structures in terms of sets [40]. Our language for BA (Figure 1) allows cardinality constraints
of the form $|A| = C$ where $C$ is a constant integer. Such constant cardinality constraints
are useful and enable quantifier elimination for the resulting language [45,67]. However,
they do not allow stating constraints such as $|A| = |B|$ for two sets $A$ and $B$, and cannot
represent constraints on changing program variables. Consider therefore the equicardinality
relation $eqcard(A, B)$ that holds iff $|A| = |B|$, and consider BA extended with the
relation $eqcard(A, B)$. Define the ternary relation $plus(A, B, C) \Leftrightarrow (|A| + |B| = |C|)$
by the formula $\exists x_1. \exists x_2.\ x_1 \cap x_2 = \emptyset \land C = x_1 \cup x_2 \land eqcard(A, x_1) \land eqcard(B, x_2)$.
The relation $plus(A, B, C)$ allows us to express addition using arbitrary sets as
representatives for natural numbers. Moreover, we can represent integers as equivalence
classes of pairs of natural numbers under the equivalence relation
$(x, y) \sim (u, v) \Leftrightarrow x + v = u + y$. This construction allows us to express the unary
predicate of being non-negative. The quantification over pairs of sets represents
quantification over integers,
and quantification over integers with the addition operation and the predicate "being 
non-negative" can express all PA operations, presented in Figure 2. Therefore, a natural 
closure under definable operations leads to our formulation of the language BAPA in 
Figure 3, which contains both sets and integers. 

The argument above also explains why we attribute the decidability of BAPA to [19, 
Section 8], which showed the decidability of BA over sets extended with the equicardi- 
nality relation eqcard, using the decidability of the first-order theory of the addition of 
cardinal numbers. 
Fig. 1. Formulas of Boolean Algebra (BA)

F ::= A | F1 ∧ F2 | F1 ∨ F2 | ¬F | ∃k.F | ∀k.F
A ::= T1 = T2 | T1 < T2 | C dvd T
T ::= C | T1 + T2 | T1 − T2 | C · T
C ::= ... −2 | −1 | 0 | 1 | 2 ...

Fig. 2. Formulas of Presburger Arithmetic (PA)

F ::= A | F1 ∧ F2 | F1 ∨ F2 | ¬F | ∃x.F | ∀x.F | ∃k.F | ∀k.F
A ::= B1 = B2 | B1 ⊆ B2 | T1 = T2 | T1 < T2 | C dvd T
B ::= x | 0 | 1 | B1 ∪ B2 | B1 ∩ B2 | B^c
T ::= k | C | MAXC | T1 + T2 | T1 − T2 | C · T | |B|
C ::= ... −2 | −1 | 0 | 1 | 2 ...

Fig. 3. Formulas of Boolean Algebras with Presburger Arithmetic (BAPA)

The language BAPA has two kinds of quantifiers: quantifiers over integers and quan- 
tifiers over sets; we distinguish between these two kinds by denoting integer variables 
with symbols such as $k, l$ and set variables with symbols such as $x, y$. We use the
shorthand $\exists^+ k. F(k)$ to denote $\exists k.\ k \ge 0 \land F(k)$ and, similarly, $\forall^+ k. F(k)$ to denote
$\forall k.\ k \ge 0 \Rightarrow F(k)$. In summary, the language of BAPA in Figure 3: 1) subsumes the
language of PA in Figure 2; 2) subsumes the language of BA in Figure 1; and 3) contains
a non-trivial combination of these two languages in the form of using the cardinality
of a set expression as an integer value. 

The semantics of operations in Figure 3 is the expected one. We interpret integer
operations in the standard way, and interpret sets in the Boolean algebra over subsets of a
finite set. The MAXC constant denotes the size of the finite universe $U$, so we require
$MAXC = |U|$ in all models. (Our results also extend to infinite sets; see Section 10 for
the discussion.)



3 Applications of BAPA 



This section illustrates the importance of BAPA constraints. Section 3.1 shows the uses 
of BAPA constraints to express and verify data structure invariants as well as procedure 
preconditions and postconditions. Section 3.2 shows how a class of simulation relation 
conditions can be proved automatically using a decision procedure for BAPA. Finally, 
Section 3.3 shows how BAPA can be used to express and prove termination conditions
for a class of programs. 



3.1 Verifying Data Structure Consistency 

Figure 4 presents a procedure insert in a language that directly manipulates sets. Such 
languages can either be directly executed [18,66] or can be derived from executable 
programs using an abstraction process [41,43]. The program in Figure 4 manipulates 
a global set of objects content and an integer field size. The program maintains an 
invariant I that the size of the set content is equal to the value of the variable size.
The insert procedure inserts an element e into the set and correspondingly updates the 
integer variable. The requires clause (precondition) of the insert procedure is that the 
parameter e is a non-null reference to an object that is not stored in the set content. 
The ensures clause (postcondition) of the procedure is that the size variable after the 
insertion is positive. Note that we represent references to objects (such as the procedure 
parameter e) as sets with at most one element. An empty set represents a null reference; 
a singleton set {o} represents a reference to object o. The value of a variable after 
procedure execution is indicated by marking the variable name with a prime. 

var content : set;
var size : integer;
invariant I ⟺ (size = |content|);

procedure insert(e : element)
maintains I
requires |e| = 1 ∧ |e ∩ content| = 0
ensures size' > 0
{
    content := content ∪ e;
    size := size + 1;
}

Fig. 4. An Example Procedure

{ |e| = 1 ∧ |e ∩ content| = 0 ∧ size = |content| }
content := content ∪ e; size := size + 1;
{ size' > 0 ∧ size' = |content'| }

Fig. 5. Hoare Triple for insert Procedure

∀e. ∀content. ∀content'. ∀size. ∀size'.
    (|e| = 1 ∧ |e ∩ content| = 0 ∧ size = |content| ∧
     content' = content ∪ e ∧ size' = size + 1) ⇒
    size' > 0 ∧ size' = |content'|

Fig. 6. Verification Condition for Figure 5

In addition to the explicit requires and ensures clauses, the insert procedure maintains
an invariant, I, which captures the relationship between the size of the set content
and the integer variable size. The invariant I is implicitly conjoined with the requires
and the ensures clause of the procedure. The Hoare triple [28] in Figure 5 summarizes
the resulting correctness condition for the insert procedure.

Figure 6 presents a verification condition corresponding to the Hoare triple in Figure 5.
Note that the verification condition contains both set and integer variables, contains
quantification over these variables, and relates the sizes of sets to the values of
integer variables. Our small example leads to a particularly simple formula; in general,
formulas that arise in the compositional analysis of set programs with integer variables
may contain alternations of existential and universal quantifiers over both integers and
sets. This paper shows the decidability of such formulas and presents the complexity of
the decision procedure.
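
As a sanity check of the verification condition in Figure 6, the following minimal sketch evaluates the implication by brute force over every assignment in a small finite universe. It is only a bounded check under assumed names (`subsets`, `vc_instance`, `check_vc` and the `require_fresh` switch are hypothetical), not the decision procedure of this paper.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def vc_instance(e, content, content2, size, size2, require_fresh=True):
    """One instance of the implication in Figure 6.

    content2 and size2 stand for the primed variables content' and size'.
    require_fresh=False drops the conjunct |e ∩ content| = 0 (an experiment,
    not part of the original condition).
    """
    pre = len(e) == 1 and size == len(content)
    if require_fresh:
        pre = pre and len(e & content) == 0
    step = content2 == (content | e) and size2 == size + 1
    post = size2 > 0 and size2 == len(content2)
    return (not (pre and step)) or post


def check_vc(n, require_fresh=True):
    """Check the implication for all sets over an n-element universe
    and all integers in a small window around the possible sizes."""
    sets = subsets(range(n))
    ints = range(-1, n + 3)
    return all(vc_instance(e, c, c2, s, s2, require_fresh)
               for e in sets for c in sets for c2 in sets
               for s in ints for s2 in ints)
```

With the full precondition the check succeeds on small universes, while dropping the freshness requirement on `e` yields a counterexample (inserting an element that is already present makes `size'` exceed `|content'|`).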

3.2 Proving Simulation Relation Conditions 

Another example of where BAPA constraints are useful is when proving that a given
relation on states is a simulation relation between two program fragments. Figure 7 shows
one such example. The concrete procedure start1 manipulates two sets: a set of running
processes and a set of suspended processes in a process scheduler. The procedure start1
inserts a new process into the set of running processes, unless there are already too many
running processes. The procedure start2 is a version of the procedure that operates in
a more abstract state space: it maintains only the union of all processes and a number
of running processes. Figure 7 shows a forward simulation relation r between the
transition relations for start1 and start2. The standard simulation relation diagram
condition [46] is
$\forall s_1. \forall s_1'. \forall s_2.\ (t_1(s_1, s_1') \land r(s_1, s_2)) \Rightarrow \exists s_2'.\ (t_2(s_2, s_2') \land r(s_1', s_2'))$.
In the presence of preconditions, $t_1(s_1, s_1') = (pre_1(s_1) \Rightarrow post_1(s_1, s_1'))$ and
$t_2(s_2, s_2') = (pre_2(s_2) \Rightarrow post_2(s_2, s_2'))$, and sufficient conditions for simulation relation are:

1. $\forall s_1. \forall s_2.\ r(s_1, s_2) \land pre_2(s_2) \Rightarrow pre_1(s_1)$

2. $\forall s_1. \forall s_1'. \forall s_2. \exists s_2'.\ r(s_1, s_2) \land post_1(s_1, s_1') \land pre_2(s_2) \Rightarrow post_2(s_2, s_2') \land r(s_1', s_2')$

Figure 7 shows BAPA formulas that correspond to the simulation relation conditions in
this example. Note that the second BAPA formula has a quantifier alternation, which
illustrates the relevance of quantifiers in BAPA.



var R : set;
var S : set;

procedure start1(x)
requires x ⊄ R ∧ |x| = 1 ∧ |R| < MAXR
ensures R' = R ∪ x ∧ S' = S
{
    R := R ∪ x;
}

var P : set;
var k : integer;

procedure start2(x)
requires x ⊄ P ∧ |x| = 1 ∧ k < MAXR
ensures P' = P ∪ x ∧ k' = k + 1
{
    P := P ∪ x;
    k := k + 1;
}

Simulation relation r:

    r((R, S), (P, k)) = (P = R ∪ S ∧ k = |R|)

Simulation relation conditions in BAPA:

1. ∀x, R, S, P, k. (P = R ∪ S ∧ k = |R|) ∧ (x ⊄ P ∧ |x| = 1 ∧ k < MAXR) ⇒
       (x ⊄ R ∧ |x| = 1 ∧ |R| < MAXR)

2. ∀x, R, S, R', S', P, k. ∃P', k'. ((P = R ∪ S ∧ k = |R|) ∧ (R' = R ∪ x ∧ S' = S) ∧
       (x ⊄ P ∧ |x| = 1 ∧ k < MAXR)) ⇒
       (P' = P ∪ x ∧ k' = k + 1) ∧ (P' = R' ∪ S' ∧ k' = |R'|)

Fig. 7. Proving simulation relation in BAPA
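
The two simulation conditions can be explored on small instances with a brute-force sketch such as the one below. It assumes that the relation on `x` in Figure 7 means "x is not a subset of", treats `k` as an integer, and uses hypothetical helper names (`subsets`, `r`, `condition1`, `condition2`); it enumerates a bounded universe only and is not a substitute for the decision procedure.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def r(R, S, P, k):
    """Simulation relation r((R, S), (P, k))."""
    return P == (R | S) and k == len(R)


def condition1(n, maxr):
    """Condition 1: the abstract precondition implies the concrete one."""
    sets = subsets(range(n))
    for x, R, S, P in product(sets, repeat=4):
        for k in range(0, n + 2):
            lhs = r(R, S, P, k) and (not x <= P) and len(x) == 1 and k < maxr
            rhs = (not x <= R) and len(x) == 1 and len(R) < maxr
            if lhs and not rhs:
                return False
    return True


def condition2(n, maxr):
    """Condition 2: for every concrete step there is an abstract witness (P', k')."""
    sets = subsets(range(n))
    for x, R, S, R2, S2, P in product(sets, repeat=6):
        for k in range(0, n + 2):
            lhs = (r(R, S, P, k) and R2 == (R | x) and S2 == S
                   and (not x <= P) and len(x) == 1 and k < maxr)
            if not lhs:
                continue  # the implication holds for any choice of witness
            witness = any(P2 == (P | x) and k2 == k + 1 and r(R2, S2, P2, k2)
                          for P2 in sets for k2 in range(0, n + 3))
            if not witness:
                return False
    return True
```

The inner `any(...)` in `condition2` plays the role of the existential quantifier over `P'` and `k'`, which is where the quantifier alternation of the second formula shows up.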

3.3 Proving Termination of Programs 

We next show how BAPA is useful for proving program termination. A standard technique
for proving termination of a loop is to introduce a ranking function f that maps
program state into a non-negative integer, and then prove that the value of the function
decreases at each loop iteration. In other words, if t(s, s') denotes the relationship
between the state at the beginning and end of the procedure, then the condition
$\forall s. \forall s'.\ t(s, s') \Rightarrow f(s) > f(s')$ holds. Figure 8 shows an example program that
processes each element of the initial value of set iter; this program can be viewed as
manipulating an iterator over a data structure that implements a set. The ability to
take the cardinality of a set allows us to define a natural ranking function for this program.
Figure 9 shows the termination proof based on such a ranking function. Note that, because
the loop contains a local variable, the resulting loop transition relation contains an
existential quantifier. The resulting termination condition can be expressed as a formula
that belongs to BAPA, and can be discharged using our decision procedure. In general,
we can reduce the termination problem of programs that manipulate both sets and
integers to showing a simulation relation with a fragment of a terminating program that
manipulates only integers, which can be proved terminating using the techniques of [55-57].
The simulation relation condition can be proved correct using our BAPA decision
procedure whenever the simulation relation is expressible with a BAPA formula.

var iter : set;

procedure iterate()
{
    while iter ≠ ∅ do
        var e : set;
        e := choose iter;
        iter := iter \ e;
        process(e);
    done
}

Fig. 8. Terminating program

Ranking function:

    f(s) = |s|

Transition relation:

    t(iter, iter') = (∃e. |e| = 1 ∧ e ⊆ iter ∧ iter' = iter \ e)

Termination condition in BAPA:

    ∀iter. ∀iter'. (∃e. |e| = 1 ∧ e ⊆ iter ∧ iter' = iter \ e) ⇒ |iter'| < |iter|

Fig. 9. Termination proof for Figure 8
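
The termination argument of Figures 8 and 9 can be illustrated with a small sketch: one function checks the termination condition by enumeration over a bounded universe, and another runs the loop and counts iterations. The names `transition`, `check_termination_condition` and `run_loop` are illustrative assumptions, and `choose` is modelled by picking an arbitrary element.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def transition(it, it2):
    """t(iter, iter'): some singleton e ⊆ iter exists with iter' = iter \\ e."""
    return any(it2 == it - {x} for x in it)


def check_termination_condition(n):
    """Check that every transition strictly decreases the ranking function |iter|."""
    sets = subsets(range(n))
    return all(len(b) < len(a)
               for a in sets for b in sets if transition(a, b))


def run_loop(initial):
    """Execute the loop of Figure 8 and return the number of iterations."""
    it = set(initial)
    steps = 0
    while it:
        e = next(iter(it))  # models "choose iter"
        it = it - {e}
        steps += 1
    return steps
```

Because the ranking function starts at `|iter|` and drops by one per iteration, `run_loop` performs exactly as many iterations as the initial set has elements.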

4 Decision Procedure for BAPA 

This section presents our algorithm, denoted a, which reduces a BAPA sentence to an 
equivalent PA sentence with the same number of quantifier alternations and an expo- 
nential increase in the total size of the formula. This algorithm has several desirable 
properties: 

1. Given the space and time bounds for PA sentences [61], the algorithm a yields 
reasonable space and time bounds for deciding BAPA sentences (Section 5). 

2. The algorithm a does not eliminate integer variables, but instead produces an equiv- 
alent quantified PA sentence. The resulting PA sentence can therefore be decided 
using any decision procedure for PA, including the decision procedures based on 
automata [23,31,44]. 

3. The algorithm a can eliminate set quantifiers from any extension of PA. We thus 
obtain a technique for adding a particular form of set reasoning to every extension 
of PA, and the technique preserves the decidability of the extension. One example
of a decidable theory that extends PA is MSOL over strings; see Section 11 for
the discussion.
4. For simplicity we present the algorithm a as a decision procedure for formulas 
with no free variables, but the algorithm can be used to transform and simplify 
formulas with free variables as well, because it transforms one quantifier at a time 
starting from the innermost one. Because of this feature, we can use the algorithm 
a to project out local state components from formulas that describe invariants and 
transition relations, and simplify the resulting formulas. 
We next describe the algorithm a for transforming a BAPA sentence $F_0$ into a PA
sentence. As the first step of the algorithm, transform $F_0$ into prenex form

$$Q_p v_p \ldots Q_1 v_1.\ F(v_1, \ldots, v_p) \qquad (1)$$

where $F$ is quantifier-free, and each quantifier $Q_i v_i$ is of one of the forms $\exists k$, $\forall k$,
$\exists y$, $\forall y$, where $k$ denotes an integer variable and $y$ denotes a set variable.

The next step of the algorithm is to separate $F$ into a BA part and a PA part. To achieve
this, replace each formula $x = y$ where $x$ and $y$ are sets, with the conjunction
$x \subseteq y \land y \subseteq x$, and replace each formula $x \subseteq y$ with the equivalent formula $|x \cap y^c| = 0$.
In the resulting formula, each set $x$ occurs in some term $|t(x)|$. Next, use the same
reasoning as when generating disjunctive normal form for propositional logic to write
each set expression $t(x)$ as a union of cubes (regions in Venn diagram [74]) of the form
$\bigcap_{i=1}^{n} x_i^{\alpha_i}$ where $x_i^{\alpha_i}$ is either $x_i$ or $x_i^c$, and $n$ is the number of set variables;
hence there are $m = 2^n$ cubes $s_1, \ldots, s_m$.
Suppose that $t(x) = s_{j_1} \cup \ldots \cup s_{j_a}$; then replace the term $|t(x)|$ with the term $\sum_{i=1}^{a} |s_{j_i}|$.
In the resulting formula, each set $x$ appears in an expression of the form $|s_i|$ where $s_i$ is
a cube. For each $s_i$ introduce a new variable $l_i$. Then the resulting formula is equivalent
to

$$Q_p v_p \ldots Q_1 v_1.\ \exists^+ l_1, \ldots, l_m.\ \bigwedge_{i=1}^{m} |s_i| = l_i \ \land\ G_1 \qquad (2)$$

where $G_1$ is a PA formula and $m = 2^n$. Formula (2) is the starting point of the main
phase of algorithm a. The main phase of the algorithm successively eliminates quantifiers
$Q_1 v_1, \ldots, Q_p v_p$ while maintaining a formula of the form

$$Q_p v_p \ldots Q_r v_r.\ \exists^+ l_1 \ldots l_q.\ \bigwedge_{i=1}^{q} |s_i| = l_i \ \land\ G_r \qquad (3)$$

where $G_r$ is a PA formula, $r$ grows from 1 to $p + 1$, and $q = 2^e$ where $e$ for $0 \le e \le n$
is the number of set variables among $v_p, \ldots, v_r$. The list $s_1, \ldots, s_q$ is the list of all $2^e$
partitions formed from the set variables among $v_p, \ldots, v_r$.
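
The rewriting of a set expression into a union of cubes, and of its cardinality into a sum of cube cardinalities, can be sketched as follows. The tuple-based expression encoding and the function names are assumptions made for this illustration; they are not taken from the implementation in Section 12.

```python
from itertools import product


def cubes_of(expr, n):
    """Return the cubes (tuples of n booleans) whose union equals expr.

    A cube entry True at position i means "inside x_i", False means "inside
    the complement of x_i". Expressions are nested tuples:
    ("var", i), ("compl", e), ("union", e1, e2), ("inter", e1, e2).
    """
    all_cubes = set(product([False, True], repeat=n))
    op = expr[0]
    if op == "var":
        return {c for c in all_cubes if c[expr[1]]}
    if op == "compl":
        return all_cubes - cubes_of(expr[1], n)
    if op == "union":
        return cubes_of(expr[1], n) | cubes_of(expr[2], n)
    if op == "inter":
        return cubes_of(expr[1], n) & cubes_of(expr[2], n)
    raise ValueError(f"unknown operator: {op}")


def evaluate(expr, env, universe):
    """Evaluate a set expression directly on concrete sets."""
    op = expr[0]
    if op == "var":
        return env[expr[1]]
    if op == "compl":
        return universe - evaluate(expr[1], env, universe)
    left = evaluate(expr[1], env, universe)
    right = evaluate(expr[2], env, universe)
    return left | right if op == "union" else left & right


def cube_size(cube, env, universe):
    """Cardinality |s_j| of one cube under a concrete assignment."""
    region = set(universe)
    for i, inside in enumerate(cube):
        region &= env[i] if inside else universe - env[i]
    return len(region)


def cardinality_via_cubes(expr, env, universe):
    """|t(x)| computed as the sum of the cardinalities of its cubes."""
    return sum(cube_size(c, env, universe) for c in cubes_of(expr, len(env)))
```

Since distinct cubes are disjoint, summing their sizes agrees with the cardinality obtained by evaluating the expression directly, which is what allows each term $|t(x)|$ to be replaced by a sum of the integer variables $l_i$.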

We next show how to eliminate the innermost quantifier $Q_r v_r$ from the formula (3).
During this process, the algorithm replaces the formula $G_r$ with a formula $G_{r+1}$ which
has more integer quantifiers. If $v_r$ is an integer variable then the number of sets $q$
remains the same, and if $v_r$ is a set variable, then $q$ reduces from $2^e$ to $2^{e-1}$. We next
consider each of the four possibilities $\exists k$, $\forall k$, $\exists y$, $\forall y$ for the quantifier $Q_r v_r$.

Consider first the case $\exists k$. Because $k$ does not occur in $\bigwedge_{i=1}^{q} |s_i| = l_i$, simply move
the existential quantifier to $G_r$ and let $G_{r+1} = \exists k. G_r$, which completes the step.

For universal quantifiers, observe that

$$\lnot \Big( \exists^+ l_1 \ldots l_q.\ \bigwedge_{i=1}^{q} |s_i| = l_i \ \land\ G_r \Big)$$

is equivalent to $\exists^+ l_1 \ldots l_q.\ \bigwedge_{i=1}^{q} |s_i| = l_i \land \lnot G_r$, because the existential quantifier is
used as a let-binding, so we may first substitute all values $l_i$ into $G_r$, then perform the
negation, and then extract back the definitions of all values $l_i$. Given that the universal
quantifier $\forall k$ can be represented as a sequence of unary operators $\lnot \exists k \lnot$, from the
elimination of $\exists k$ we immediately obtain the elimination of $\forall k$; it turns out that it suffices
to let $G_{r+1} = \forall k. G_r$.

We next show how to eliminate an existential set quantifier $\exists y$ from

$$\exists y.\ \exists^+ l_1 \ldots l_q.\ \bigwedge_{i=1}^{q} |s_i| = l_i \ \land\ G_r \qquad (4)$$

which is equivalent to $\exists^+ l_1 \ldots l_q.\ (\exists y.\ \bigwedge_{i=1}^{q} |s_i| = l_i) \land G_r$. This is the key step of
the algorithm and relies on the following lemma, whose proof is in Section 9.

Lemma 1. Let $b_1, \ldots, b_n$ be finite disjoint sets, and $l_1, \ldots, l_n, k_1, \ldots, k_n$ be natural
numbers. Then the following two statements are equivalent: (1) There exists a finite set
$y$ such that $\bigwedge_{i=1}^{n} |b_i \cap y| = k_i \land |b_i \cap y^c| = l_i$ and (2) $\bigwedge_{i=1}^{n} |b_i| = k_i + l_i$. Moreover,
the statement continues to hold if for any subset of indices $i$ the conjunct $|b_i \cap y| = k_i$
is replaced by $|b_i \cap y| \ge k_i$ or the conjunct $|b_i \cap y^c| = l_i$ is replaced by $|b_i \cap y^c| \ge l_i$,
provided that $|b_i| = k_i + l_i$ is replaced by $|b_i| \ge k_i + l_i$, as indicated in Figure 10.





original formula                                   eliminated form
∃y. ... |b ∩ y| ≥ k ∧ |b ∩ y^c| ≥ l ...            |b| ≥ k + l
∃y. ... |b ∩ y| = k ∧ |b ∩ y^c| ≥ l ...            |b| ≥ k + l
∃y. ... |b ∩ y| ≥ k ∧ |b ∩ y^c| = l ...            |b| ≥ k + l
∃y. ... |b ∩ y| = k ∧ |b ∩ y^c| = l ...            |b| = k + l

Fig. 10. Rules for Eliminating Quantifiers from Boolean Algebra Expressions
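
Lemma 1 and the rules of Figure 10 can be spot-checked by enumeration. The sketch below compares, for two disjoint sets over a small universe, the existence of a witness set y with the eliminated arithmetic condition, for every mix of `=` and `≥` conjuncts. The function names and the bounds on the universe and on the integers are assumptions of this illustration; the proof itself is the one referred to in Section 9.

```python
from itertools import combinations, product
import operator


def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def exists_y(universe, candidates, bs, ks, ls, rels):
    """Left-hand side: some y satisfies every pair of cardinality conjuncts."""
    for y in candidates:
        yc = universe - y
        if all(rk(len(b & y), k) and rl(len(b & yc), l)
               for b, k, l, (rk, rl) in zip(bs, ks, ls, rels)):
            return True
    return False


def eliminated(bs, ks, ls, rels):
    """Right-hand side (Figure 10): '=' only if both conjuncts are equalities."""
    eq = operator.eq
    return all((len(b) == k + l) if (rk is eq and rl is eq) else (len(b) >= k + l)
               for b, k, l, (rk, rl) in zip(bs, ks, ls, rels))


def check_lemma(n_elems=3, max_val=2):
    """Compare both sides for all disjoint pairs (b1, b2) and small k_i, l_i."""
    universe = frozenset(range(n_elems))
    sets = subsets(universe)
    rel_pairs = list(product([operator.eq, operator.ge], repeat=2))
    for b1, b2 in product(sets, repeat=2):
        if b1 & b2:
            continue  # the lemma assumes disjoint sets
        for k1, k2, l1, l2 in product(range(max_val + 1), repeat=4):
            for rels in product(rel_pairs, repeat=2):
                bs, ks, ls = (b1, b2), (k1, k2), (l1, l2)
                lhs = exists_y(universe, sets, bs, ks, ls, rels)
                if lhs != eliminated(bs, ks, ls, rels):
                    return False
    return True
```

Disjointness of the sets is what lets the witness y be chosen independently inside each $b_i$, which is why the check skips overlapping pairs.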

In the quantifier elimination step, assume without loss of generality that the set variables
$s_1, \ldots, s_q$ are numbered such that $s_{2i-1} = s_i' \cap y^c$ and $s_{2i} = s_i' \cap y$ for some cube $s_i'$.
Then apply Lemma 1 and replace each pair of conjuncts

$$|s_i' \cap y^c| = l_{2i-1} \ \land\ |s_i' \cap y| = l_{2i}$$

with the conjunct $|s_i'| = l_{2i-1} + l_{2i}$, yielding formula

$$\exists^+ l_1 \ldots l_q.\ \bigwedge_{i=1}^{q'} |s_i'| = l_{2i-1} + l_{2i} \ \land\ G_r \qquad (5)$$

for $q' = 2^{e-1}$. Finally, to obtain a formula of the form (3) for $r + 1$, introduce fresh
variables $l_i'$ constrained by $l_i' = l_{2i-1} + l_{2i}$, rewrite (5) as

$$\exists^+ l_1' \ldots l_{q'}'.\ \bigwedge_{i=1}^{q'} |s_i'| = l_i' \ \land\ \Big( \exists^+ l_1 \ldots l_q.\ \bigwedge_{i=1}^{q'} l_i' = l_{2i-1} + l_{2i} \ \land\ G_r \Big)$$

and let

$$G_{r+1} \equiv \exists^+ l_1 \ldots l_q.\ \bigwedge_{i=1}^{q'} l_i' = l_{2i-1} + l_{2i} \ \land\ G_r \qquad (6)$$

This completes the description of elimination of an existential set quantifier $\exists y$.

To eliminate a set quantifier $\forall y$, proceed analogously: introduce fresh variables
$l_i' = l_{2i-1} + l_{2i}$ and let $G_{r+1} \equiv \forall^+ l_1 \ldots l_q.\ (\bigwedge_{i=1}^{q'} l_i' = l_{2i-1} + l_{2i}) \Rightarrow G_r$, which can be
verified by expressing $\forall y$ as $\lnot \exists y \lnot$.

After eliminating all quantifiers as described above, we obtain a formula of the form
$\exists^+ l.\ |U| = l \land G_{p+1}(l)$. We define the result of the algorithm, denoted $a(F_0)$, to be the
PA sentence $G_{p+1}(MAXC)$.

This completes the description of the algorithm a. Given that the validity of PA 
sentences is decidable, the algorithm a is a decision procedure for BAPA sentences. 

Theorem 2. The algorithm a described above maps each BAPA-sentence $F_0$ into an
equivalent PA-sentence $a(F_0)$.

Formalization of the algorithm a. To formalize the algorithm a, we have implemented
it in the functional programming language O'Caml; Section 12 contains the source
code of the implementation. As an illustration, when we run the implementation on the
BAPA formula in Figure 6 which represents a verification condition, we immediately
obtain the PA formula in Figure 11. Note that the structure of the resulting formula
mimics the structure of the original formula: every set quantifier is replaced by the
corresponding block of quantifiers over non-negative integers constrained to partition the
previously introduced integer variables. Figure 12 presents the correspondence between
the set variables of the BAPA formula and the integer variables of the translated PA
formula. Note that the relationship content' = content ∪ e translates into the conjunction
of the constraints $|content' \cap (content \cup e)^c| = 0 \land |(content \cup e) \cap content'^c| = 0$,
which reduces to the conjunction $l_{100} = 0 \land l_{011} + l_{001} + l_{010} = 0$ using the
translation of set expressions into the disjoint union of partitions, and the correspondence in
Figure 12.
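
The reduction of content' = content ∪ e to constraints on region sizes can be checked on concrete sets. The sketch below assumes the indexing suggested by Figure 12, where the three bits of an index refer to content', content and e in that order and a 0 bit selects the complement; the helper names are hypothetical.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def region_sizes(universe, sets_in_order):
    """Map each bit string to the size of the matching Venn region.

    Bit j of the key refers to sets_in_order[j]; '1' means the set itself,
    '0' means its complement.
    """
    sizes = {}
    for bits in product("01", repeat=len(sets_in_order)):
        region = set(universe)
        for bit, s in zip(bits, sets_in_order):
            region &= s if bit == "1" else universe - s
        sizes["".join(bits)] = len(region)
    return sizes


def check_union_translation(n):
    """Compare the set-level facts with their integer translations."""
    universe = frozenset(range(n))
    for content2, content, e in product(subsets(universe), repeat=3):
        l = region_sizes(universe, (content2, content, e))
        direct = content2 == (content | e)
        translated = l["100"] == 0 and l["011"] + l["001"] + l["010"] == 0
        if direct != translated:
            return False
        # |content'| is the sum of all regions whose first bit is 1.
        if len(content2) != l["111"] + l["101"] + l["110"] + l["100"]:
            return False
    return True
```

The same region table also yields the other constraints of the translated formula, for example `|e|` as the sum of all regions whose last bit is 1.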

The subsequent sections explore the consequences of the existence of the algorithm 
a, including an upper bound on the computational complexity of BAPA sentences and 
the combination of BA with proper extensions of PA. We explain our experience with 
using the implementation in Section 6. 

5 Complexity 

In this section we analyze the algorithm a from Section 4 and obtain space and time 
bounds on BAPA from the corresponding space and time bounds for PA. We then show 
that the new decision procedure meets good worst-case space bounds for BA if applied 
to BA formulas. Moreover, by construction, our procedure reduces to the procedure for 
Presburger arithmetic formulas if there are no set quantifiers. In summary, our decision 
procedure is reasonable for BA, does not impose any overhead for pure PA formulas, 
and the complexity of the general BAPA validity has the same height of the tower of 
exponentials as the complexity of PA itself. 
∀+ l_1. ∀+ l_0. MAXC = l_1 + l_0 ⇒
  ∀+ l_11. ∀+ l_01. ∀+ l_10. ∀+ l_00.
    l_1 = l_11 + l_01 ∧ l_0 = l_10 + l_00 ⇒
  ∀+ l_111. ∀+ l_011. ∀+ l_101. ∀+ l_001.
  ∀+ l_110. ∀+ l_010. ∀+ l_100. ∀+ l_000.
    l_11 = l_111 + l_011 ∧ l_01 = l_101 + l_001 ∧
    l_10 = l_110 + l_010 ∧ l_00 = l_100 + l_000 ⇒
  ∀size. ∀size'.
    (l_111 + l_011 + l_101 + l_001 = 1 ∧
     l_111 + l_011 = 0 ∧
     l_111 + l_011 + l_110 + l_010 = size ∧
     l_100 = 0 ∧
     l_011 + l_001 + l_010 = 0 ∧
     size' = size + 1) ⇒
    (0 < size' ∧
     l_111 + l_101 + l_110 + l_100 = size')

Fig. 11. The translation of the BAPA sentence from Figure 6 into a PA sentence

general relationship:

    l_{i1,...,ik} = |set_q^{i1} ∩ set_{q+1}^{i2} ∩ ... ∩ set_S^{ik}|
    q = S − (k − 1)
    S − number of set variables

in this example:

    set_1 = content'
    set_2 = content
    set_3 = e

    l_000 = |content'^c ∩ content^c ∩ e^c|
    l_001 = |content'^c ∩ content^c ∩ e|
    l_010 = |content'^c ∩ content ∩ e^c|
    l_011 = |content'^c ∩ content ∩ e|
    l_100 = |content' ∩ content^c ∩ e^c|
    l_101 = |content' ∩ content^c ∩ e|
    l_110 = |content' ∩ content ∩ e^c|
    l_111 = |content' ∩ content ∩ e|

Fig. 12. The Correspondence between Integer Variables in Figure 11 and Set Variables
in Figure 6

5.1 An Elementary Upper Bound 

We next show that the algorithm in Section 4 transforms a BAPA sentence $F_0$ into a PA
sentence whose size is at most one exponential larger and which has the same number
of quantifier alternations.

If $F$ is a formula in prenex form, let $size(F)$ denote the size of $F$, and let $alts(F)$
denote the number of quantifier alternations in $F$. Define the iterated exponentiation
function $\exp_k(x)$ by $\exp_0(x) = x$ and $\exp_{k+1}(x) = 2^{\exp_k(x)}$. We have the following
lemma.

Lemma 3. For the algorithm a from Section 4 there is a constant $c > 0$ such that
$size(a(F_0)) \le 2^{c \cdot size(F_0)}$ and $alts(a(F_0)) = alts(F_0)$. Moreover, the algorithm a runs
in $2^{O(size(F_0))}$ space.
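
For readers who want to see how quickly the iterated exponentiation function grows, here is a minimal sketch of its definition; the function name `exp_k` and its argument order are choices made for this illustration.

```python
def exp_k(k, x):
    """Iterated exponentiation: exp_0(x) = x and exp_{k+1}(x) = 2 ** exp_k(x)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    value = x
    for _ in range(k):
        value = 2 ** value
    return value
```

Already `exp_k(3, 2)` equals 65536, so a bound of the form $\exp_2$ versus $\exp_3$ differs by a full level of the exponential tower.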

We next consider the worst-case space bound on BAPA. Recall first the following 
bound on space complexity for PA. 

Fact 1 [20, Chapter 3] The validity of a PA sentence of length n can be decided in 

space exp 2 (0(n)). 

From Lemma 3 and Fact 1 we conclude that the validity of BAPA formulas can be 

decided in space exp 3 (0(n)). It turns out, however, that we obtain better bounds on 

BAPA validity by analyzing the number of quantifier alternations in BA and BAPA 

formulas. 

Fact 2 [61] The validity of a PA sentence of length n and the number of quantifier 
alternations m can be decided in space 2™ 

From Lemma 3 and Fact 2 we obtain our space upper bound, which implies the upper
bound on deterministic time.

Theorem 4. The validity of a BAPA sentence of length $n$ with $m$ quantifier
alternations can be decided in space $\exp_2(O(mn))$, and, consequently, in
deterministic time $\exp_3(O(mn))$.
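
The step from Lemma 3 and Fact 2 to Theorem 4 can be spelled out as the following short calculation. It is a sketch that assumes the bound in Fact 2 has the form $2^{N^{O(m)}}$ for a PA sentence of length $N$; the symbol $N$ is introduced here only for illustration and stands for the size of the translated sentence.

```latex
\begin{align*}
N = \mathrm{size}(\alpha(F_0)) &\le 2^{cn},
  \qquad \mathrm{alts}(\alpha(F_0)) = m
  && \text{(Lemma 3)} \\
\text{space} &\le 2^{N^{O(m)}}
  \le 2^{(2^{cn})^{O(m)}}
  = 2^{2^{O(mn)}}
  = \exp_2(O(mn))
  && \text{(Fact 2)} \\
\text{time} &\le 2^{O(\text{space})}
  = \exp_3(O(mn))
\end{align*}
```

The exponential blow-up in formula size only multiplies the exponent by $n$, because the number of quantifier alternations $m$ is unchanged by the translation.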

If we approximate quantifier alternations by formula size, we conclude that BAPA
validity can be decided in space $\exp_2(O(n^2))$, compared to the $\exp_2(O(n))$ bound for
Presburger arithmetic from Fact 1. Therefore, despite the exponential explosion in the size
of the formula in the algorithm $\alpha$, thanks to the same number of quantifier alternations,
our bound is not very far from the bound for Presburger arithmetic.

5.2 BA as a Special Case 

We next analyze the result of applying the algorithm $\alpha$ to a BA sentence $F_0$. By a
BA sentence we mean a BAPA sentence without cardinality constraints, containing only
the standard operations $\cap$, $\cup$, ${}^c$ and the relations $\subseteq$, $=$. At first, it might
seem that the algorithm $\alpha$ is not a reasonable approach to deciding BA formulas given that
the best upper bounds for PA are worse than the corresponding bounds for BA. However, we
identify a special form of PA sentences $\mathrm{PA}_{\mathrm{BA}} = \{\alpha(F_0) \mid F_0 \text{ is in BA}\}$ and show
that such sentences can be decided in $2^{O(n)}$ space, which is optimal for BA [32]. Our
analysis shows that using binary representations of integers that correspond to the sizes
of sets achieves a similar effect to representing these sets as bitvectors, although the two
representations are not identical.

Let $S$ be the number of set variables in the initial formula $F_0$ (recall that set variables
are the only variables in $F_0$). Let $l_1, \ldots, l_q$ be the set of free variables of the formula
$G_r(l_1, \ldots, l_q)$; then $q = 2^e$ for $e = S + 1 - r$. Let $w_1, \ldots, w_q$ be integers specifying
the values of $l_1, \ldots, l_q$. We then have the following lemma.

Lemma 5. For each $r$ where $1 \le r \le S$ the truth value of $G_r(w_1, \ldots, w_q)$ is equal to
the truth value of $G_r(\bar{w}_1, \ldots, \bar{w}_q)$ where $\bar{w}_i = \min(w_i, 2^{r-1})$.
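
For intuition on the simplest case $r = 1$ of Lemma 5, the following Python sketch clamps integer values to $\min(w_i, 2^{r-1})$ and checks, on a small grid, that a formula built only from atoms of the form "a sum of variables equals zero" keeps its truth value. The helper names and the sample formula are hypothetical and chosen only for this illustration; the check is exhaustive only over the small range shown, not a proof.

```python
from itertools import product


def truncate(values, r):
    """Clamp each value to min(w, 2**(r-1)), as in the statement of Lemma 5."""
    bound = 2 ** (r - 1)
    return tuple(min(w, bound) for w in values)


def atom_is_zero(indices, values):
    """Atomic formula l_{i_1} + ... + l_{i_k} = 0 over non-negative integers."""
    return sum(values[i] for i in indices) == 0


def sample_formula(values):
    """Hypothetical quantifier-free formula: (l0 + l1 = 0) or not (l2 = 0)."""
    return atom_is_zero([0, 1], values) or not atom_is_zero([2], values)


# Compare original and clamped valuations on a small grid for r = 1.
agree = all(
    sample_formula(w) == sample_formula(truncate(w, 1))
    for w in product(range(5), repeat=3)
)
print(agree)  # True
```

The check succeeds because a sum of non-negative integers is zero exactly when every summand is zero, so only the zero or non-zero status of each variable matters.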

Now consider a formula $F_0$ of size $n$ with $S$ free variables. Then $\alpha(F_0) = G_{S+1}$.
By Lemma 3, $\mathrm{size}(\alpha(F_0))$ is $O(nS2^S)$. By Lemma 5, it suffices for the outermost
variable $k$ to range over the integer interval $[0, 2^S]$, and the range of subsequent variables
is even smaller. Therefore, the value of each of the $2^{S+1} - 1$ variables can be represented
in $O(S)$ space, which is the same order of space used to represent the names
of variables themselves. This means that evaluating the formula $\alpha(F_0)$ can be done in
the same space $O(nS2^S)$ as the size of the formula. Representing the valuation assigning
values to variables can be done in $O(S2^S)$ space, so the truth value of the formula
can be evaluated in $O(nS2^S)$ space, which is certainly $2^{O(n)}$. We obtain the following
theorem.

Theorem 6. If $F_0$ is a pure BA formula with $S$ variables and of size $n$, then the truth
value of $\alpha(F_0)$ can be computed in $O(nS2^S)$ and therefore $2^{O(n)}$ space.

6 Experience Using Our Decision Procedure for BAPA 

We have experimented with BAPA in the context of the Jahob system [34] for verifying data
structure consistency of Java programs. Jahob parses Java source code annotated with
formulas in Isabelle syntax written in comments, generates verification conditions, and
uses decision procedures and theorem provers to discharge these verification conditions.
Jahob currently contains interfaces to the Isabelle interactive theorem prover [51], the
Simplify theorem prover [17] as well as the Omega Calculator [60] and the LASH [44]
decision procedures for PA.

Using Jahob, we have generated verification conditions for several Java program 
fragments that require reasoning about sets and their cardinalities, for example proving 
the equality relation between the number of elements in a list and the integer field size 
after they have been updated. Formulas arising from examples in Section 3 have also 
been discharged using our current implementation. We have found that Simplify is able 
to deal with some of the formulas involving only sets or only integers, but not with 
formulas that relate cardinalities of operations on sets to cardinalities of the individual 
sets. These formulas can be proved in Isabelle, but require user interaction in terms 
of auxiliary lemmas. On the other hand, our implementation of the decision procedure 
automatically discharges these formulas. 
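
As an illustration of the kind of condition described above, the following Python sketch checks a hypothetical insertion condition by brute force over a small finite universe: adding a fresh element to a set and incrementing a size counter keeps the counter equal to the cardinality. This is only a sanity check on finitely many cases under stated assumptions. It is not the decision procedure, and it does not reflect how Jahob generates or discharges verification conditions.

```python
from itertools import combinations


def subsets(universe):
    """Yield every subset of the given finite universe as a frozenset."""
    items = list(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def insertion_vc_holds(universe):
    """Check a hypothetical insertion condition for all sets over a finite universe.

    Assumption: if e is fresh, size == |content|, content' = content | {e} and
    size' = size + 1, then size' == |content'| and size' > 0.
    """
    for content in subsets(universe):
        size = len(content)
        for e in universe:
            if e in content:
                continue  # the precondition requires e to be fresh
            new_content = content | {e}
            new_size = size + 1
            if not (new_size == len(new_content) and new_size > 0):
                return False
    return True


print(insertion_vc_holds(range(4)))  # True
```

A decision procedure is needed precisely because such enumeration covers only one fixed universe size, whereas the condition must hold for all finite sets.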

Our current implementation makes use of some transformations and simplifications
to reduce formula sizes. We find that eliminating set variables early by substitution is
a highly effective optimization. When using the Omega Calculator as the backend for our
system, we also observed that lifting quantifiers to the top level noticeably improves
performance. These transformations effectively extend the range of formulas that the
current system can handle. Our current implementation of the decision procedure and
example formulas can be found on the website [33].

7 Related Work 

Our paper is the first result that shows a complexity bound for the first-order theory
of BAPA. The decidability of BAPA, formulated as BA with equicardinality constraints,
was presented in [19] (see Section 2). A decision procedure for a special case of BAPA
was presented in [79], which allows only quantification over elements but not over sets
of elements. BAPA is a more general language because singleton sets can represent
elements, so quantification over sets allows modelling quantification over elements. [62]
(which appeared after [38]) shows the decidability of BA with constant cardinalities.

Presburger arithmetic. The original result on decidability of PA is [59]. The best
known bound on formula size is [20]. This decision procedure was improved in [16]
and subsequently in [52]. An analysis based on the number of quantifier alternations is
presented in [61]. Our implementation uses the quantifier-elimination based Omega test [60]
which, in our current experience, outperforms other implementations we have tried.
Among the decision procedures for full PA, [13] is the only proof-generating version,
and is based on a version of [16]. Decidable fragments of arithmetic that go beyond PA
include MSOL over strings [11,31] and [9].

Boolean Algebras. The first results on decidability of BA are from [45], [1, Chapter 4]
and use quantifier elimination, from which one can derive the small model property;
[32] gives the complexity of the satisfiability problem. [48] studies unification in
Boolean rings. The quantifier-free fragment of BA is shown NP-complete in [47]; see
[39] for a generalization of this result using parameterized complexity of the
Bernays-Schönfinkel-Ramsey class of first-order logic [8, Page 258] which can be decided
using [24] or [7]. [12] gives an overview of several fragments of set theory including
theories with quantifiers but no cardinality constraints and theories with cardinality
constraints but no quantification over sets. Quantifier-free formulas are also used in
constraint solving [2,6,15]. Among the systems for interactively reasoning about richer
theories of sets are Isabelle [51], HOL [26], PVS [53], TPS [3]; first-order frameworks
such as Athena [4] can use axiomatizations of sets along with calls to resolution-based
theorem provers such as Vampire [75] to reason about sets.

Combinations of Decidable Theories. The techniques for combining quantifier-free 
theories [50,63] and their generalizations such as [71-73,77,78] are of great importance 
for program verification. Our paper shows a particular combination result for quantified 
formulas, which add additional expressive power in writing specifications. Among the 
general results for quantified formulas are the Feferman-Vaught theorem for products 
[19] and term powers [36, 37]. While we have found quantifiers to be useful in several 
contexts, many problems can be encoded in quantifier-free formulas, so it is interesting 
to consider a combination of BAPA with solvers for quantifier-free formulas [21,25,69]. 
Description logics [5] and two-variable logic with counting [27,54,58] support sets and 
cardinalities, and additionally support relations, but do not allow quantification over 
sets. 

Analyses of Dynamic Data Structures. In addition to the new technical results, one 
of the contributions of our paper is to identify the uses of our decision procedure for 
verifying data structure consistency. We have shown how BAPA enables the verifica- 
tion tools to reason about sets and their sizes. This capability is particularly important 
for analyses that handle dynamically allocated data structures where the number of ob- 
jects is statically unbounded [35,49,65,76]. Recently, these approaches were extended 
to handle the combinations of the constraints representing data structure contents and 
constraints representing numerical properties of data structures [14,64]. Our result
provides a systematic mechanism for building precise and predictable versions of such
analyses. Among other constraints used for data structure analysis, BAPA is unique 
in being a complete algorithm for an expressive theory that supports arbitrary quanti- 
fiers. As we have illustrated in Section 3, the use of quantifiers is important for proving 
verification conditions that include quantified annotations, for computing abstractions 
of program fragments that involve local variables, and for proving simulation relation 
conditions. We have also illustrated the use of BAPA for reasoning about termination of
programs that manipulate dynamic data structures by associating integer variables with 
sizes of sets that specify the objects in data structures and using techniques for proving 
termination of programs with integers [55-57]. Other possible applications of our deci- 
sion procedure include query evaluation in constraint databases [62] and loop invariant 
inference [30]. 

8 Conclusion 

Motivated by static analysis and verification of relations between data structure content 
and size, we have presented an algorithm for deciding the first-order theory of Boolean 
algebras with Presburger arithmetic (BAPA), showed an elementary upper bound on the
worst-case complexity, implemented the algorithm and applied it to several reasoning 
tasks. Our experience indicates that the algorithm will be useful as a component of a 
decision procedure of our data structure verification system. 




Acknowledgements. We thank Alexis Bes for pointing out the relevance of [19, Section 
8], Chin Wei-Ngan for useful discussions on the analysis of data structure size con- 
straints and useful comments on a version of this paper, Calogero Zarba for comments 
on an earlier version of this paper, Peter Revesz for pointing to his recent paper [62], 
Andreas Podelski for discussions about transition relations, Bruno Courcelle on remarks 
regarding undecidability of MSOL with equicardinality constraints, Cesare Tinelli and 
Konstantin Korovin on discussions of Bernays-Schonfinkel-Ramsey class, the members 
of the Stanford REACT group and the Berkeley CHESS group on useful discussions on 
decision procedures and program analysis. 

References 

1. W. Ackermann. Solvable Cases of the Decision Problem. North Holland, 1954. 

2. Alex Aiken, Dexter Kozen, Moshe Vardi, and Ed Wimmers. The complexity of set con- 
straints. In Proceedings of Computer Science Logic 1993, pages 1-17, September 1993. 

3. Peter B. Andrews, Sunil Issar, Dan Nesmith, and Frank Pfenning. The TPS theorem proving
system. In 10th CADE, volume 449 of LNAI, pages 641-642, 1990.

4. Konstantine Arkoudas, Karen Zee, Viktor Kuncak, and Martin Rinard. Verifying a file
system implementation. In Sixth International Conference on Formal Engineering Methods
(ICFEM'04), volume 3308 of LNCS, Seattle, Nov 8-12, 2004.

5. Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter Patel- 
Schneider, editors. The Description Logic Handbook: Theory, Implementation and Appli- 
cations. Cambridge University Press, 2003. 

6. Leo Bachmair, Harald Ganzinger, and Uwe Waldmann. Set constraints are the monadic class. 
In Logic in Computer Science, pages 75-83, 1993. 

7. Peter Baumgartner and Cesare Tinelli. The Model Evolution Calculus. In Franz Baader,
editor, CADE-19 - The 19th International Conference on Automated Deduction, volume
2741 of Lecture Notes in Artificial Intelligence, pages 350-364. Springer, 2003.

8. Egon Börger, Erich Grädel, and Yuri Gurevich. The Classical Decision Problem.
Springer-Verlag, 1997.

9. M. Bozga and R. Iosif. On decidability within the arithmetic of addition and divisibility. In 
FOSSACS'05, 2005. 

10. V. Bruyère, G. Hansel, C. Michaux, and R. Villemaire. Logic and p-recognizable sets of
integers. Bull. Belg. Math. Soc. Simon Stevin, 1:191-238, 1994.

11. J. R. Büchi. Weak second-order arithmetic and finite automata. Z. Math. Logik Grundl.
Math., 6:66-92, 1960.

12. Domenico Cantone, Eugenio Omodeo, and Alberto Policriti. Set Theory for Computing. 
Springer, 2001. 

13. Amine Chaieb and Tobias Nipkow. Generic proof synthesis for Presburger arithmetic.
Technical report, Technische Universität München, October 2003.

14. Wei-Ngan Chin, Siau-Cheng Khoo, and Dana N. Xu. Extending sized types with
collection analysis. In ACM SIGPLAN Workshop on Partial Evaluation and Semantics Based
Program Manipulation (PEPM'03), 2003.

15. Christian Bessiere, Emmanuel Hebrard, Brahim Hnich, and Toby Walsh. Disjoint, partition
and intersection constraints for set and multiset variables. In CP'04, pages 138-152, 2004.

16. D. C. Cooper. Theorem proving in arithmetic without multiplication. In B. Meltzer and 
D. Michie, editors, Machine Intelligence, volume 7, pages 91-100. Edinburgh University 
Press, 1972. 

17. David Detlefs, Greg Nelson, and James B. Saxe. Simplify: A theorem prover for program 
checking. Technical Report HPL-2003-148, HP Laboratories Palo Alto, 2003. 




18. Robert K. Dewar. Programming by refinement, as exemplified by the SETL representation 
sublanguage. Transactions on Programming Languages and Systems, July 1979. 

19. S. Feferman and R. L. Vaught. The first order properties of products of algebraic systems. 
Fundamenta Mathematicae, 47:57-103, 1959. 

20. Jeanne Ferrante and Charles W. Rackoff. The Computational Complexity of Logical
Theories, volume 718 of Lecture Notes in Mathematics. Springer-Verlag, 1979.

21. Cormac Flanagan, Rajeev Joshi, Xinming Ou, and James B. Saxe. Theorem proving using 
lazy proof explication. In CAV, pages 355-367, 2003. 

22. Robert W. Floyd. Assigning meanings to programs. In Proc. Amer. Math. Soc. Symposia in 
Applied Mathematics, volume 19, pages 19-31, 1967. 

23. Vijay Ganesh, Sergey Berezin, and David L. Dill. Deciding Presburger arithmetic by model
checking and comparisons with other methods. In Formal Methods in Computer-Aided
Design. Springer-Verlag, November 2002.

24. H. Ganzinger and K. Korovin. Integrating equational reasoning into instantiation-based the- 
orem proving. In Computer Science Logic (CSL'04), volume 3210 of Lecture Notes in Com- 
puter Science, pages 71-84. Springer, 2004. 

25. Harald Ganzinger, George Hagen, Robert Nieuwenhuis, Albert Oliveras, and Cesare Tinelli. 
DPLL(T): Fast decision procedures. In R. Alur and D. Peled, editors, Proceedings of the 16th 
International Conference on Computer Aided Verification, CAV '04 (Boston, Massachusetts), 
volume 3114 of Lecture Notes in Computer Science, pages 175-188. Springer, 2004. 

26. M. J. C. Gordon and T. F. Melham. Introduction to HOL, a theorem proving environment for 
higher-order logic. Cambridge University Press, Cambridge, England, 1993. 

27. Erich Grädel, Martin Otto, and Eric Rosen. Two-variable logic with counting is decidable.
In Proceedings of 12th IEEE Symposium on Logic in Computer Science LICS '97, Warschau,
1997.

28. C. A. R. Hoare. An axiomatic basis for computer programming. Communications of the 
ACM, 12(10):576-580, 1969. 

29. Wilfrid Hodges. Model Theory, volume 42 of Encyclopedia of Mathematics and its Appli- 
cations. Cambridge University Press, 1993. 

30. Deepak Kapur. Automatically generating loop invariants using quantifier elimination. In 
IMACS Intl. Conf. on Applications of Computer Algebra, 2004. 

31. Nils Klarlund, Anders Møller, and Michael I. Schwartzbach. MONA implementation
secrets. In Proc. 5th International Conference on Implementation and Application of Automata.
LNCS, 2000.

32. Dexter Kozen. Complexity of boolean algebras. Theoretical Computer Science, 10:221-247, 
1980. 

33. Viktor Kuncak. BAPA web page, http://www.mit.edu/~vkuncak/projects/bapa/, 2004. 

34. Viktor Kuncak. The Jahob project web page, http://www.mit.edu/~vkuncak/projects/jahob/, 
2004. 

35. Viktor Kuncak, Patrick Lam, and Martin Rinard. Role analysis. In Proc. 29th POPL, 2002. 

36. Viktor Kuncak and Martin Rinard. On the theory of structural subtyping. Technical Report 
879, Laboratory for Computer Science, Massachusetts Institute of Technology, 2003. 

37. Viktor Kuncak and Martin Rinard. Structural subtyping of non-recursive types is decidable. 
In Eighteenth Annual IEEE Symposium on Logic in Computer Science, 2003. 

38. Viktor Kuncak and Martin Rinard. The first-order theory of sets with cardinality constraints 
is decidable. Technical Report 958, MIT CSAIL, July 2004. 

39. Viktor Kuncak and Martin Rinard. Decision procedures for set-valued fields. In 1st
International Workshop on Abstract Interpretation of Object-Oriented Languages (AIOOL 2005),
2005.

40. Patrick Lam, Viktor Kuncak, and Martin Rinard. Generalized typestate checking using set 
interfaces and pluggable analyses. SIGPLAN Notices, 39:46-55, March 2004. 




41. Patrick Lam, Viktor Kuncak, and Martin Rinard. On our experience with modular pluggable 
analyses. Technical Report 965, MIT CSAIL, September 2004. 

42. Patrick Lam, Viktor Kuncak, and Martin Rinard. Cross-cutting techniques in program spec- 
ification and analysis. In 4th International Conference on Aspect-Oriented Software Devel- 
opment (AOSD 2005), 2005. 

43. Patrick Lam, Viktor Kuncak, and Martin Rinard. Generalized typestate checking for data 
structure consistency. In 6th International Conference on Verification, Model Checking and 
Abstract Interpretation, 2005. 

44. LASH. The LASH Toolset, http://www.montefiore.ulg.ac.be/~boigelot/research/lash/.

45. L. Loewenheim. Über Möglichkeiten im Relativkalkül. Math. Annalen, 76:228-251, 1915.

46. Nancy Lynch and Frits Vaandrager. Forward and backward simulations - Part I: Untimed 
systems. Information and Computation, 121(2), 1995. 

47. Kim Marriott and Martin Odersky. Negative boolean constraints. Technical Report 94/203, 
Monash University, August 1994. 

48. Ursula Martin and Tobias Nipkow. Boolean unification: The story so far. Journal of Symbolic 
Computation, 7(3):275-293, 1989. 

49. Anders Møller and Michael I. Schwartzbach. The Pointer Assertion Logic Engine. In Proc.
ACM PLDI, 2001.

50. Greg Nelson. Techniques for program verification. Technical report, XEROX Palo Alto 
Research Center, 1981. 

51. Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. Isabelle/HOL: A Proof Assistant
for Higher-Order Logic, volume 2283 of LNCS. Springer-Verlag, 2002.

52. Derek C. Oppen. Elementary bounds for Presburger arithmetic. In Proceedings of the fifth
annual ACM symposium on Theory of computing, pages 34-37. ACM Press, 1973.

53. S. Owre, J. M. Rushby, and N. Shankar. PVS: A prototype verification system. In Deepak
Kapur, editor, 11th CADE, volume 607 of LNAI, pages 748-752, June 1992.

54. Leszek Pacholski, Wieslaw Szwast, and Lidia Tendera. Complexity results for first-order
two-variable logic with counting. SIAM J. on Computing, 29(4):1083-1117, 2000.

55. Andreas Podelski and Andrey Rybalchenko. A complete method for synthesis of linear 
ranking functions. In VMCAI'04, 2004. 

56. Andreas Podelski and Andrey Rybalchenko. Transition invariants. In LICS '04, 2004. 

57. Andreas Podelski and Andrey Rybalchenko. Transition predicate abstraction and fair termi- 
nation. In POPL'05, 2005. 

58. Ian Pratt-Hartmann. Complexity of the two-variable fragment with (binary-coded) counting
quantifiers. CoRR, cs.LO/0411031, 2004.

59. M. Presburger. Über die Vollständigkeit eines gewissen Systems der Arithmetik ganzer Zahlen,
in welchem die Addition als einzige Operation hervortritt. In Comptes Rendus du premier
Congrès des Mathématiciens des Pays slaves, Warszawa, pages 92-101, 1929.

60. William Pugh. The Omega test: a fast and practical integer programming algorithm for de- 
pendence analysis. In Supercomputing '91: Proceedings of the 1991 ACM/IEEE conference 
on Supercomputing, pages 4-13. ACM Press, 1991. 

61. C. R. Reddy and D. W. Loveland. Presburger arithmetic with bounded quantifier alternation. 
In Proceedings of the tenth annual ACM symposium on Theory of computing, pages 320-325. 
ACM Press, 1978. 

62. Peter Revesz. Quantifier-elimination for the first-order theory of boolean algebras with lin- 
ear cardinality constraints. In Proc. Advances in Databases and Information Systems (AD- 
BIS'04), volume 3255 of LNCS, 2004. 

63. Harald Ruess and Natarajan Shankar. Deconstructing Shostak. In Proc. 16th IEEE LICS, 
2001. 




64. Radu Rugina. Quantitative shape analysis. In Static Analysis Symposium (SAS'04), 2004. 

65. Mooly Sagiv, Thomas Reps, and Reinhard Wilhelm. Parametric shape analysis via 3-valued
logic. ACM TOPLAS, 24(3):217-298, 2002.

66. E. Schonberg, J. T. Schwartz, and M. Sharir. An automatic technique for selection of data 
representations in Setl programs. Transactions on Programming Languages and Systems, 
3(2):126-143, 1991. 

67. Thoralf Skolem. Untersuchungen über die Axiome des Klassenkalküls und über
"Produktations- und Summationsprobleme", welche gewisse Klassen von Aussagen
betreffen. Skrifter utgit av Videnskapsselskapet i Kristiania, I. klasse, no. 3, Oslo, 1919.

68. Larry Stockmeyer and Albert R. Meyer. Cosmological lower bound on the circuit complexity 
of a small problem in logic. J. ACM, 49(6):753-784, 2002. 

69. A. Stump, C. Barrett, and D. Dill. CVC: a Cooperating Validity Checker. In 14th
International Conference on Computer-Aided Verification, 2002.

70. Wolfgang Thomas. Languages, automata, and logic. In Handbook of Formal Languages
Vol. 3: Beyond Words. Springer-Verlag, 1997.

71. Cesare Tinelli. Cooperation of background reasoners in theory reasoning by residue sharing.
Journal of Automated Reasoning, 30(1):1-31, January 2003.

72. Cesare Tinelli and Calogero Zarba. Combining non-stably infinite theories. Journal of 
Automated Reasoning, 2004. (Accepted for publication). 

73. Ashish Tiwari. Decision procedures in automated deduction. PhD thesis, Department of 
Computer Science, State University of New York at Stony Brook, 2000. 

74. John Venn. On the diagrammatic and mechanical representation of propositions and
reasonings. Dublin Philosophical Magazine and Journal of Science, 9(59):1-18, 1880.

75. Andrei Voronkov. The anatomy of Vampire (implementing bottom-up procedures with code 
trees). Journal of Automated Reasoning, 15(2):237-265, 1995. 

76. Greta Yorsh, Thomas Reps, and Mooly Sagiv. Symbolically computing most-precise abstract 
operations for shape analysis. In 10th TACAS, 2004. 

77. Calogero G. Zarba. The Combination Problem in Automated Reasoning. PhD thesis, Stan- 
ford University, 2004. 

78. Calogero G. Zarba. Combining sets with elements. In Nachum Dershowitz, editor,
Verification: Theory and Practice, volume 2772 of Lecture Notes in Computer Science, pages
762-782. Springer, 2004.

79. Calogero G. Zarba. A quantifier elimination algorithm for a fragment of set theory involving 
the cardinality operator. In 18th International Workshop on Unification, 2004. 

80. Karen Zee, Patrick Lam, Viktor Kuncak, and Martin Rinard. Combining theorem proving 
with static analysis for data structure consistency. In International Workshop on Software 
Verification and Validation (SVV2004), Seattle, November 2004. 

APPENDIX 
9 Proofs of Lemmas 

Lemma 1. Let $b_1, \ldots, b_n$ be finite disjoint sets, and $l_1, \ldots, l_n$, $k_1, \ldots, k_n$ be natural
numbers. Then the following two statements are equivalent: (1) There exists a finite set
$y$ such that $\bigwedge_{i=1}^{n} |b_i \cap y| = k_i \land |b_i \cap y^c| = l_i$ and (2) $\bigwedge_{i=1}^{n} |b_i| = k_i + l_i$. Moreover,
the statement continues to hold if for any subset of indices $i$ the conjunct $|b_i \cap y| = k_i$
is replaced by $|b_i \cap y| \ge k_i$ or the conjunct $|b_i \cap y^c| = l_i$ is replaced by $|b_i \cap y^c| \ge l_i$,
provided that $|b_i| = k_i + l_i$ is replaced by $|b_i| \ge k_i + l_i$, as indicated in Figure 10.

Proof. ($\Rightarrow$) Suppose that there exists a set $y$ satisfying (1). Because $b_i \cap y$ and $b_i \cap y^c$
are disjoint, $|b_i| = |b_i \cap y| + |b_i \cap y^c|$, so $|b_i| = k_i + l_i$ when the conjuncts are
$|b_i \cap y| = k_i \land |b_i \cap y^c| = l_i$, and $|b_i| \ge k_i + l_i$ if any of the original conjuncts have
inequality.

($\Leftarrow$) Suppose that (2) holds. First consider the case of equalities. Suppose that $|b_i| =
k_i + l_i$ for each of the pairwise disjoint sets $b_1, \ldots, b_n$. For each $b_i$ choose a subset
$y_i \subseteq b_i$ such that $|y_i| = k_i$. Because $|b_i| = k_i + l_i$, we have $|b_i \cap y_i^c| = l_i$. Having
chosen $y_1, \ldots, y_n$, let $y = \bigcup_{i=1}^{n} y_i$. For $i \ne j$ we have $b_i \cap y_j = \emptyset$ and $b_i \cap y_j^c = b_i$, so
$b_i \cap y = y_i$ and $b_i \cap y^c = b_i \cap y_i^c$. By the choice of $y_i$, we conclude that $y$ is the desired
set for which (1) holds. The case of inequalities is analogous: for example, in the case
$|b_i \cap y| \ge k_i \land |b_i \cap y^c| = l_i$, choose $y_i \subseteq b_i$ such that $|y_i| = |b_i| - l_i$.
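
The construction in the ($\Leftarrow$) direction for the case of equalities can be sketched in Python as follows. The function name `build_witness` and the sample sets are hypothetical; the sketch simply picks $k_i$ elements from each disjoint set and takes the union, then checks both cardinality constraints.

```python
def build_witness(blocks, ks):
    """Pick k_i elements from each disjoint block b_i and return their union y."""
    y = set()
    for block, k in zip(blocks, ks):
        if k > len(block):
            raise ValueError("need |b_i| >= k_i")
        # Any k-element subset works; sorting only makes the choice deterministic.
        y.update(sorted(block)[:k])
    return y


blocks = [{1, 2, 3}, {4, 5}, {6}]              # pairwise disjoint sets b_1, b_2, b_3
ks = [2, 0, 1]                                 # required sizes of b_i intersected with y
ls = [len(b) - k for b, k in zip(blocks, ks)]  # chosen so that |b_i| = k_i + l_i

y = build_witness(blocks, ks)
for b, k, l in zip(blocks, ks, ls):
    assert len(b & y) == k   # |b_i ∩ y| = k_i
    assert len(b - y) == l   # |b_i ∩ y^c| = l_i
print(sorted(y))  # [1, 2, 6]
```

Disjointness of the blocks is what allows the choices for different $b_i$ to be made independently.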

Lemma 3. For the algorithm $\alpha$ from Section 4 there is a constant $c > 0$ such that
$\mathrm{size}(\alpha(F_0)) \le 2^{c \cdot \mathrm{size}(F_0)}$ and $\mathrm{alts}(\alpha(F_0)) = \mathrm{alts}(F_0)$.
Moreover, the algorithm $\alpha$ runs in $2^{O(\mathrm{size}(F_0))}$ space.

Proof. To gain some intuition on the size of $\alpha(F_0)$ compared to the size of $F_0$, compare
first the formula in Figure 11 with the original formula in Figure 6. Let $n$ denote the
size of the initial formula $F_0$ and let $S$ be the number of set variables. Note that the
following operations are polynomially bounded in time and space: 1) transforming a
formula into prenex form, 2) transforming relations $b_1 = b_2$ and $b_1 \subseteq b_2$ into the
form $|b| = 0$. Introducing set variables for each partition and replacing each $|b|$ with
a sum of integer variables yields formula $G_1$ whose size is bounded by $O(n 2^S S)$ (the
last $S$ factor is because representing a variable from the set of $K$ variables requires
space $\log K$). The subsequent transformations introduce the existing integer quantifiers,
whose size is bounded by $n$, and introduce additionally $2^{S-1} + \ldots + 2 + 1 = 2^S - 1$
new integer variables along with the equations that define them. Note that the defining
equations always have the form $l'_i = l_{2i-1} + l_{2i}$ and have size bounded by $S$. We
therefore conclude that the size of $\alpha(F_0)$ is $O(nS(2^S + 2^S))$ and therefore $O(nS2^S)$,
which is certainly $O(2^{cn})$ for any $c > 1$. Moreover, note that we have obtained a more
precise bound $O(nS2^S)$ indicating that the exponential explosion is caused only by set
variables. Finally, the fact that the number of quantifier alternations is the same in $F_0$
and $\alpha(F_0)$ is immediate because the algorithm replaces one set quantifier with a block
of corresponding integer quantifiers.
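
The source of the $2^S$ factor can be seen in a small Python sketch that lists the Venn regions covered by a set expression and writes its cardinality as a sum of integer variables, one per region. The naming scheme `l_<bits>`, the bit order, and the helper names are assumptions made for this illustration and need not match the indexing used in the figures.

```python
from itertools import product


def venn_regions(expr, num_sets):
    """Return the Venn regions (as 0/1 tuples) contained in a set expression.

    A region is a tuple of membership bits, one per set variable; `expr` maps
    such a tuple to True when elements of that region belong to the expression.
    """
    return [bits for bits in product((0, 1), repeat=num_sets) if expr(bits)]


def cardinality_as_sum(expr, num_sets):
    """Render |expr| as a sum of integer variables l_<bits>, one per region."""
    names = ["l_" + "".join(map(str, bits)) for bits in venn_regions(expr, num_sets)]
    return " + ".join(names) if names else "0"


def new_variable_count(num_sets):
    """Number of variables 2^(S-1) + ... + 2 + 1 introduced when eliminating S sets."""
    return sum(2 ** j for j in range(num_sets))


# Hypothetical example with two set variables x and y: the expression x ∪ y.
union = lambda bits: bool(bits[0] or bits[1])
print(cardinality_as_sum(union, 2))  # l_01 + l_10 + l_11
print(new_variable_count(3))         # 7
```

With $S$ set variables there are $2^S$ regions, so a single cardinality term may expand into a sum with up to $2^S$ summands.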

Lemma 5. For each $r$ where $1 \le r \le S$ the truth value of $G_r(w_1, \ldots, w_q)$ is equal to
the truth value of $G_r(\bar{w}_1, \ldots, \bar{w}_q)$ where $\bar{w}_i = \min(w_i, 2^{r-1})$.

Proof. We prove the claim by induction. For $r = 1$, observe that the translation of a
quantifier-free part of the pure BA formula yields a PA formula $F_1$ all of whose atomic
formulas are of the form $l_{i_1} + \ldots + l_{i_k} = 0$, which are equivalent to $\bigwedge_{j=1}^{k} l_{i_j} = 0$.
Therefore, the truth value of $F_1$ depends only on whether the integer variables are zero
or non-zero, which means that we may restrict the variables to the interval $[0, 1]$.

For the inductive step, consider the elimination of a set variable, and assume that
the property holds for $G_r$ and for all $q$-tuples of non-negative integers $w_1, \ldots, w_q$.
Let $q' = q/2$ and let $w'_1, \ldots, w'_{q'}$ be a tuple of non-negative integers. We show that
$G_{r+1}(w'_1, \ldots, w'_{q'})$ is equivalent to $G_{r+1}(\bar{w}'_1, \ldots, \bar{w}'_{q'})$, where $\bar{w}'_i = \min(w'_i, 2^r)$.

Suppose first that $G_{r+1}(\bar{w}'_1, \ldots, \bar{w}'_{q'})$ holds. Then for each $\bar{w}'_i$ there are $u_{2i-1}$ and
$u_{2i}$ such that $\bar{w}'_i = u_{2i-1} + u_{2i}$ and $G_r(u_1, \ldots, u_q)$. We define witnesses $w_1, \ldots, w_q$
as follows. If $w'_i \le 2^r$, then let $w_{2i-1} = u_{2i-1}$ and $w_{2i} = u_{2i}$. If $w'_i > 2^r$ then either
$u_{2i-1} \ge 2^{r-1}$ or $u_{2i} \ge 2^{r-1}$ (or both). If $u_{2i-1} \ge 2^{r-1}$, then let $w_{2i-1} = w'_i - u_{2i}$
and $w_{2i} = u_{2i}$. Note that $G_r(\ldots, w_{2i-1}, \ldots) \Leftrightarrow G_r(\ldots, u_{2i-1}, \ldots) \Leftrightarrow
G_r(\ldots, 2^{r-1}, \ldots)$ by induction hypothesis because both $u_{2i-1} \ge 2^{r-1}$ and $w_{2i-1} \ge
2^{r-1}$. For $w_1, \ldots, w_q$ chosen as above we therefore have $w'_i = w_{2i-1} + w_{2i}$ and
$G_r(w_1, \ldots, w_q)$, which by definition of $G_{r+1}$ means that $G_{r+1}(w'_1, \ldots, w'_{q'})$ holds.

Conversely, suppose that $G_{r+1}(w'_1, \ldots, w'_{q'})$ holds. Then there are $w_1, \ldots, w_q$ such
that $G_r(w_1, \ldots, w_q)$ and $w'_i = w_{2i-1} + w_{2i}$. If $w_{2i-1} \le 2^{r-1}$ and $w_{2i} \le 2^{r-1}$ then
$w'_i \le 2^r$ so let $u_{2i-1} = w_{2i-1}$ and $u_{2i} = w_{2i}$. If $w_{2i-1} > 2^{r-1}$ and $w_{2i} > 2^{r-1}$
then let $u_{2i-1} = 2^{r-1}$ and $u_{2i} = 2^{r-1}$. If $w_{2i-1} > 2^{r-1}$ and $w_{2i} \le 2^{r-1}$ then let
$u_{2i-1} = 2^r - w_{2i}$ and $u_{2i} = w_{2i}$. By induction hypothesis we have $G_r(u_1, \ldots, u_q) =
G_r(w_1, \ldots, w_q)$. Furthermore, $u_{2i-1} + u_{2i} = \bar{w}'_i$, so $G_{r+1}(\bar{w}'_1, \ldots, \bar{w}'_{q'})$ by definition
of $G_{r+1}$.

10 BAPA with Potentially Infinite Sets 

We next sketch the extension of our algorithm a (Section 4) to the case when the uni- 
verse of the structure may be infinite, and the underlying language has the ability to 
distinguish between finite and infinite sets. Infinite sets are useful in program analy- 
sis for modelling pools of objects such as those arising in dynamic object allocation. 
This section presents an approach that avoids directly reasoning about cardinalities of 
infinite sets and thus remains within the language of PA. (As was observed in [19], an 
alternative is to use a generalization of PA that admits infinite cardinals.) 

We generalize the language of BAPA and the interpretation of BAPA operations as 
follows. 

1. Introduce a unary predicate $\mathrm{fin}(b)$ which is true iff $b$ is a finite set. The predicate
$\mathrm{fin}(b)$ allows us to generalize our algorithm to the case of infinite universe, and
additionally gives the expressive power to distinguish between finite and infinite
sets. For example, using $\mathrm{fin}(b)$ we can express bounded quantification over finite or
over infinite sets.

2. Define $|b|$ to be the integer zero if $b$ is infinite, and the cardinality of $b$ if $b$ is finite.

3. Introduce propositional variables denoted by letters such as $p$, $q$, and quantification
over propositional variables. Extend also the underlying PA formulas with
propositional variables, which is acceptable because a variable $p$ can be treated
as a shorthand for an integer from $\{0, 1\}$ if each use of $p$ as an atomic formula is
interpreted as the atomic formula $(p = 1)$. Our extended algorithm uses the
equivalences $\mathrm{fin}(b) \Leftrightarrow p$ to represent the finiteness of sets just as it uses the equations
$|b| = l$ to represent the cardinalities of finite sets.

4. Introduce a propositional constant FINU such that $\mathrm{fin}(\mathcal{U}) \Leftrightarrow \mathrm{FINU}$. This
propositional constant enables equivalence preserving quantifier elimination over the set
of models that includes both models with finite universe $\mathcal{U}$ and the models with
infinite universe $\mathcal{U}$.

Denote the resulting extended language BAPA$^\infty$.
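
The conventions for $\mathrm{fin}(b)$ and $|b|$ in items 1 and 2 can be illustrated with a small Python sketch. It models only sets that are finite or cofinite over an infinite universe, which is a simplifying assumption of this example (BAPA$^\infty$ also admits sets that are infinite with infinite complement); the class name and fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinCofinSet:
    """A set over an infinite universe that is either finite or cofinite.

    `elements` lists the members when `cofinite` is False, and the
    non-members (the finite complement) when `cofinite` is True.
    """
    elements: frozenset
    cofinite: bool = False

    def fin(self) -> bool:
        """The predicate fin(b): true iff b is a finite set."""
        return not self.cofinite

    def card(self) -> int:
        """|b| as defined above: the cardinality if finite, else the integer zero."""
        return len(self.elements) if self.fin() else 0

    def complement(self) -> "FinCofinSet":
        """Complement with respect to the (infinite) universe."""
        return FinCofinSet(self.elements, not self.cofinite)


b = FinCofinSet(frozenset({1, 2, 3}))
print(b.fin(), b.card())                            # True 3
print(b.complement().fin(), b.complement().card())  # False 0
```

Because $|b|$ is zero for every infinite set, the finiteness predicate is needed to tell an infinite set apart from the empty set.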

The following lemma generalizes Lemma 1 for the case of equalities. 

Lemma 7. Let $b_1, \ldots, b_n$ be disjoint sets, $l_1, \ldots, l_n$, $k_1, \ldots, k_n$ be natural numbers,
and $p_1, \ldots, p_n$, $q_1, \ldots, q_n$ be propositional values. Then the following two statements
are equivalent:

1. There exists a set $y$ such that

$$\bigwedge_{i=1}^{n} |b_i \cap y| = k_i \land (\mathrm{fin}(b_i \cap y) \Leftrightarrow p_i) \land |b_i \cap y^c| = l_i \land (\mathrm{fin}(b_i \cap y^c) \Leftrightarrow q_i) \qquad (7)$$

2.

$$\bigwedge_{i=1}^{n} (p_i \land q_i \Rightarrow |b_i| = k_i + l_i) \land (\mathrm{fin}(b_i) \Leftrightarrow (p_i \land q_i)) \qquad (8)$$

Proof. ($\Rightarrow$) Suppose that there exists a set $y$ satisfying (7). From $b_i = (b_i \cap y) \cup (b_i \cap y^c)$,
we have $\mathrm{fin}(b_i) \Leftrightarrow (p_i \land q_i)$. Furthermore, if $p_i$ and $q_i$ hold, then both $b_i \cap y$ and $b_i \cap y^c$
are finite, so the relation $|b_i| = |b_i \cap y| + |b_i \cap y^c|$ holds.

($\Leftarrow$) Suppose that (8) holds. For each $i$ we choose a subset $y_i \subseteq b_i$, depending on
the truth values of $p_i$ and $q_i$, as follows.

1. If both $p_i$ and $q_i$ are true, then $\mathrm{fin}(b_i)$ holds, so $b_i$ is finite. Choose $y_i$ as any subset
of $b_i$ with $k_i$ elements, which is possible since $b_i$ has $k_i + l_i$ elements.

2. If $p_i$ does not hold, but $q_i$ holds, then $\mathrm{fin}(b_i)$ does not hold, so $b_i$ is infinite. Choose
$y'_i$ as any finite subset of $b_i$ with $l_i$ elements and let $y_i = b_i \setminus y'_i$ be the corresponding
cofinite set.

3. Analogously, if $p_i$ holds, but $q_i$ does not hold, then $b_i$ is infinite; choose $y_i$ as any
finite subset of $b_i$ with $k_i$ elements.

4. If $p_i$ and $q_i$ are both false, then $b_i$ is also infinite; every infinite set can be written
as a disjoint union of two infinite sets, so let $y_i$ be one such set.

Let $y = \bigcup_{i=1}^{n} y_i$. As in the proof of Lemma 1, we have $b_i \cap y = y_i$ and $b_i \cap y^c = b_i \setminus y_i$.
By construction of $y_1, \ldots, y_n$ we conclude that (7) holds.
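The forward direction of the lemma can be spot-checked on an abstraction of sets by their sizes. The following is a minimal sketch, not a proof: the names `card`, `cond8` and `forward_holds` are hypothetical, and a set is modelled only by whether it is finite and, if so, by its number of elements.

```ocaml
(* Hypothetical size abstraction: finite with n elements, or infinite. *)
type card = Fin of int | Inf

let fin = function Fin _ -> true | Inf -> false
let size = function Fin n -> n | Inf -> 0   (* |b| is 0 for infinite b *)

let union_disjoint c1 c2 =
  match c1, c2 with
  | Fin m, Fin n -> Fin (m + n)
  | _ -> Inf

(* condition (8) for a single index i *)
let cond8 (b : card) (k, p) (l, q) : bool =
  ((not (p && q)) || size b = k + l) && (fin b = (p && q))

(* split b into a part c1 (inside y) and a part c2 (outside y), read off
   the values k, p, l, q that (7) forces, and test (8) for b *)
let forward_holds (c1 : card) (c2 : card) : bool =
  let b = union_disjoint c1 c2 in
  cond8 b (size c1, fin c1) (size c2, fin c2)
```

For example, `forward_holds (Fin 2) (Fin 3)`, `forward_holds (Fin 3) Inf` and `forward_holds Inf Inf` all evaluate to `true`, while `cond8 (Fin 5) (2, true) (2, true)` is `false` because a five-element set cannot be split into two parts of two elements each.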

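The algorithm $\alpha$ for $\mathrm{BAPA}^{\infty}$ is analogous to the algorithm for BAPA. In each step,
the new algorithm maintains a formula of the form

$$\exists^{+} l_1 \ldots l_q.\ \exists p_1 \ldots p_q.\ \Big( \bigwedge_{i=1}^{q} |s_i| = l_i \land (\mathrm{fin}(s_i) \Leftrightarrow p_i) \Big) \land G_r$$

As in Section 4, the algorithm eliminates an integer quantifier $\exists k$ by letting $G_{r+1} = \exists k. G_r$
and eliminates an integer quantifier $\forall k$ by letting $G_{r+1} = \forall k. G_r$. Furthermore,
just as the algorithm in Section 4 uses Lemma 1 to reduce a set quantifier to integer
quantifiers, the new algorithm uses Lemma 7 for this purpose. The algorithm replaces

$$\exists y.\ \exists^{+} l_1 \ldots l_q.\ \exists p_1 \ldots p_q.\ \Big( \bigwedge_{i=1}^{q} |s_i| = l_i \land (\mathrm{fin}(s_i) \Leftrightarrow p_i) \Big) \land G_r$$

with

$$\exists^{+} l'_1 \ldots l'_{q'}.\ \exists p'_1 \ldots p'_{q'}.\ \Big( \bigwedge_{i=1}^{q'} |s'_i| = l'_i \land (\mathrm{fin}(s'_i) \Leftrightarrow p'_i) \Big) \land G_{r+1}$$

for $q' = q/2$, and

$$G_{r+1} = \exists^{+} l_1 \ldots l_q.\ \exists p_1, \ldots, p_q.\ \Big( \bigwedge_{i=1}^{q'} (p_{2i-1} \land p_{2i} \Rightarrow l'_i = l_{2i-1} + l_{2i}) \land (p'_i \Leftrightarrow (p_{2i-1} \land p_{2i})) \Big) \land G_r$$

For the quantifier $\forall y$ the algorithm analogously generates

$$G_{r+1} = \forall^{+} l_1 \ldots l_q.\ \forall p_1, \ldots, p_q.\ \Big( \bigwedge_{i=1}^{q'} (p_{2i-1} \land p_{2i} \Rightarrow l'_i = l_{2i-1} + l_{2i}) \land (p'_i \Leftrightarrow (p_{2i-1} \land p_{2i})) \Big) \Rightarrow G_r$$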
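The conjuncts that relate the old variables to the new ones can be generated mechanically. The sketch below is only an illustration under stated assumptions: the miniature types `term` and `form` and the function `pair_constraints` are hypothetical, and the datatype of Section 12 has no propositional variables, so this is not the implementation described there.

```ocaml
(* Hypothetical miniature formula type with propositional variables. *)
type term = Var of string | Sum of term * term

type form =
  | Eq of term * term
  | Prop of string
  | And of form list
  | Impl of form * form
  | Iff of form * form

(* olds: [(l_1, p_1); ...; (l_q, p_q)]
   news: [(l'_1, p'_1); ...; (l'_q', p'_q')]   with q = 2 * q'
   For each i, build the conjunct
     (p_{2i-1} /\ p_{2i} => l'_i = l_{2i-1} + l_{2i})
     /\ (p'_i <=> (p_{2i-1} /\ p_{2i})). *)
let rec pair_constraints olds news =
  match olds, news with
  | [], [] -> []
  | (l1, p1) :: (l2, p2) :: olds', (l', p') :: news' ->
      let both = And [Prop p1; Prop p2] in
      And [Impl (both, Eq (Var l', Sum (Var l1, Var l2)));
           Iff (Prop p', both)]
      :: pair_constraints olds' news'
  | _ -> failwith "pair_constraints: expected 2q' old pairs and q' new pairs"
```

The result would be conjoined with $G_r$ for an existential set quantifier, or used as the antecedent of an implication for a universal one, before the quantifiers over the old variables are added.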

After eliminating all quantifiers, the algorithm obtains a formula of the form
$\exists^{+} l.\ \exists p.\ |\mathcal{U}| = l \land (\mathrm{fin}(\mathcal{U}) \Leftrightarrow p) \land G_{p+1}(l, p)$. We define the result of the algorithm
to be the PA sentence $G_{p+1}(\mathrm{MAXC}, \mathrm{FINU})$.

This completes our description of the generalized algorithm $\alpha$ for $\mathrm{BAPA}^{\infty}$. The
complexity analysis from Section 5 also applies to the generalized version. We also
note that our algorithm yields an equivalent formula over any family of models. A
sentence is valid in a set of models iff it is valid on each model. Therefore, the validity
of a $\mathrm{BAPA}^{\infty}$ sentence $F_0$ is given by applying to the formula $\alpha(F_0)(\mathrm{MAXC}, \mathrm{FINU})$
a form of universal quantifier over all pairs (MAXC, FINU) that determine the
characteristics of the models in question. For example, for the validity over the models
with infinite universe we use $\alpha(F_0)(0, \mathit{false})$, for validity over all finite models we use
$\forall k.\ \alpha(F_0)(k, \mathit{true})$, and for the validity over all models we use the PA formula
$\alpha(F_0)(0, \mathit{false}) \land \forall k.\ \alpha(F_0)(k, \mathit{true})$.
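The three validity conditions can be written as formula constructors. This is a hedged sketch: the types `term` and `form` and the functions `valid_infinite`, `valid_finite` and `valid_all` are hypothetical, and the argument `g` merely stands for $\alpha(F_0)$ viewed as a function of the pair (MAXC, FINU).

```ocaml
(* Hypothetical miniature PA syntax, enough to state the three cases. *)
type term = Var of string | Const of int

type form =
  | True
  | False
  | Leq of term * term
  | Or of form list
  | And of form list
  | Forall of string * form   (* quantification over natural numbers *)

(* g stands for alpha(F_0) as a function of (MAXC, FINU);
   the variable "k" is assumed not to occur in the formulas g builds *)
let valid_infinite (g : term -> form -> form) : form =
  g (Const 0) False

let valid_finite (g : term -> form -> form) : form =
  Forall ("k", g (Var "k") True)

let valid_all (g : term -> form -> form) : form =
  And [valid_infinite g; valid_finite g]
```

Each constructor returns a PA sentence, which would then be passed to a decision procedure for PA.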

We therefore have the following result. 

Theorem 8. The algorithm above effectively reduces the validity of $\mathrm{BAPA}^{\infty}$ sentences
to the validity of Presburger arithmetic formulas with the same number of quantifier
alternations, with an increase in formula size that is exponential in the number of set
variables; the reduction works for each of the following: 1) the set of all models, 2) the set of
models with infinite universe only, and 3) the set of all models with finite universe.

11 Relationship with MSOL over Strings 

The monadic second-order logic (MSOL) over strings is a decidable logic that can 
encode Presburger arithmetic by encoding addition using one successor symbol and 
quantification over sets. This logic therefore simultaneously supports sets and integers, 
so it is natural to examine its relationship with BAPA. It turns out that there are two 
important differences between MSOL over strings and BAPA: 

1. BAPA can express relationships of the form |A| = k where A is a set variable and
k is an integer variable; such a relation is not definable in MSOL over strings.

2. In MSOL over strings, the sets contain binary digits of an integer whereas in BAPA 
the sets contain uninterpreted elements. 

Given these differences, a natural question is to consider the decidability of an extension
of MSOL that allows stating relations |A| = k where A is a set of digits and k is an
integer variable. Note that by saying $\exists k.\ |A| = k \land |B| = k$ we can express |A| = |B|, so
we obtain MSOL with equicardinality constraints. However, extensions of MSOL over
strings with equicardinality constraints are known to be undecidable; we review some
reductions in Section 11.2. Undecidability results such as these are perhaps what led to
the conjectures that BAPA itself is undecidable [79, Page 12]. In this paper we pointed
out that BAPA is, in fact, decidable and proved that it has an elementary decision
procedure. Moreover, we present a combination of BA with MSOL over n-successors that
is still decidable.

11.1 Decidability of MSOL with Cardinalities on Uninterpreted Sets 

We next note that our algorithm also applies to monadic second-order logic of n
successors, which is more expressive than PA itself [70, Page 400], [10].

Consider the multisorted language BAMSOL defined as follows. First, BAMSOL
contains all relations of monadic second-order logic of n successors, whose variables
range over strings over an n-ary alphabet and sets of such strings. Second, BAMSOL
contains sets of uninterpreted elements and boolean algebra operations on them. Third,
BAMSOL allows stating relationships of the form |x| = k where x is a set of
uninterpreted elements and k is a string representing a natural number. Because all PA
operations are definable in MSOL of 1-successor, the algorithm $\alpha$ applies in this case
as well. Indeed, the algorithm $\alpha$ only needs a "lower bound" on the expressive power
of the theory of integers that BA is combined with: the ability to state constraints of
the form $l'_i = l_{2i-1} + l_{2i}$, and quantification over integers. Therefore, applying $\alpha$ to a
BAMSOL formula results in an MSOL formula. This shows that BAMSOL is decidable
and can be decided using a combination of algorithm $\alpha$ and a tool such as [31]. By
Lemma 3, the decision procedure for BAMSOL based on translation to MSOL has an
upper bound of $\exp_n(O(n))$ using a decision procedure such as [31]. The corresponding
non-elementary lower bound follows from the lower bound on MSOL itself [68].

11.2 Undecidability of MSOL of Integer Sets with Cardinalities 

We first note that there is a reduction from the Post Correspondence Problem (PCP) that
shows the undecidability of MSOL with equicardinality constraints. Namely, we can
represent binary strings by finite sets of natural numbers. In this encoding,
MSOL itself can easily express the local property that, at a given position, a string
contains a given finite substring. The equicardinality gives the additional ability of finding
the n-th element of an increasing sequence of elements. To encode a PCP instance, it
suffices to write a formula checking the existence of a string (represented as a set A) and
the existence of two increasing sequences of equal length (represented by sets U and
D), such that for each $i$, there exists a pair $(a_j, b_j)$ of the PCP instance such that the
position starting at $U_i$ contains the constant string $a_j$, and $U_{i+1} = U_i + |a_j|$, and similarly
the position starting at $D_i$ contains $b_j$ and $D_{i+1} = D_i + |b_j|$.
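The objects that the formula quantifies over can be illustrated on a concrete instance. The sketch below is not the MSOL encoding itself: it checks a candidate PCP solution directly and computes the position sequences, and the names `is_solution` and `starts`, as well as the choice of 0 as the first position, are assumptions made for illustration.

```ocaml
(* A PCP instance is an array of pairs (a_j, b_j); a candidate solution
   is a non-empty list of indices into that array. *)
let is_solution (pairs : (string * string) array) (idx : int list) : bool =
  idx <> []
  && String.concat "" (List.map (fun j -> fst pairs.(j)) idx)
     = String.concat "" (List.map (fun j -> snd pairs.(j)) idx)

(* Start positions of successive blocks: the first block starts at 0
   (an assumption) and each next one starts |block| positions later,
   mirroring U_{i+1} = U_i + |a_j| and D_{i+1} = D_i + |b_j|. *)
let starts (blocks : string list) : int list =
  let step (acc, pos) s = (pos :: acc, pos + String.length s) in
  List.rev (fst (List.fold_left step ([], 0) blocks))
```

For the instance with pairs (a, baa), (ab, aa), (bba, bb) and the index sequence `[2; 1; 2; 0]`, both concatenations equal `bbaabbbaa`; the sequence for U is `[0; 3; 5; 8]` and the sequence for D is `[0; 2; 4; 6]`, two increasing sequences of equal length as the encoding requires.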
12 O'Caml source code of algorithm α
(* -------------------------------------------------- *)
(* datatype of formulas *)

type ident = string

type binder = Forallset | Existsset
            | Forallint | Existsint | Forallnat | Existsnat

type form =
  | Not of form
  | And of form list | Or of form list | Impl of form * form
  | Bind of binder * ident * form
  | Inteq of intTerm * intTerm | Less of intTerm * intTerm
  | Seteq of setTerm * setTerm | Subseteq of setTerm * setTerm
and intTerm =
  | Intvar of ident | Const of int
  | Plus of intTerm list | Minus of intTerm * intTerm | Times of int * intTerm
  | Card of setTerm
and setTerm =
  | Setvar of ident | Emptyset | Fullset | Complement of setTerm
  | Union of setTerm list | Inter of setTerm list

(* name of the variable denoting the cardinality of the universe *)
let maxcard = "MAXC"



(* -------------------------------------------------- *)
(* algorithm \alpha *)

(* replace Seteq and Subseteq with Card(...)=0 *)
let simplify_set_relations (f:form) : form =
  let rec sform f = match f with
    | Not f -> Not (sform f)
    | And fs -> And (List.map sform fs)
    | Or fs -> Or (List.map sform fs)
    | Impl(f1,f2) -> Impl(sform f1, sform f2)
    | Bind(b,id,f1) -> Bind(b,id,sform f1)
    | Less(it1,it2) -> Less(it1,it2)
    | Inteq(it1,it2) -> Inteq(it1,it2)
    | Seteq(st1,st2) -> And [sform (Subseteq(st1,st2));
                             sform (Subseteq(st2,st1))]
    | Subseteq(st1,st2) -> Inteq(Card(Inter [st1; Complement st2]), Const 0)
  in sform f

(* split f into quantifier sequence and body;
   the sequence is returned innermost quantifier first *)
let split_quants_body f =
  let rec vl f acc = match f with
    | Bind(b,id,f1) -> vl f1 ((b,id)::acc)
    | f -> (acc,f)
  in vl f []

(* extract set variables from quantifier sequence *)
let extract_set_vars quants =
  List.map (fun (b,id) -> id)
    (List.filter (fun (b,id) -> (b=Forallset || b=Existsset))
       quants)

type partition = (ident * setTerm) list

(* make canonical name for integer variable naming a cube *)
let make_name sts =
  let rec mk sts = match sts with
    | [] -> ""
    | (Setvar _)::sts1 -> "1" ^ mk sts1
    | (Complement (Setvar _))::sts1 -> "0" ^ mk sts1
    | _ -> failwith "make_name: unexpected partition form"
  in "l_" ^ mk sts

(* make all cubes over vs *)
let generate_partition (vs : ident list) : partition =
  let add id ss = (Setvar id)::ss in
  let addc id ss = (Complement (Setvar id))::ss in
  let add_set id inters =
    List.map (add id) inters @
    List.map (addc id) inters in
  let mk_nm is = (make_name is, Inter is) in
  List.map mk_nm
    (List.map List.rev
       (List.fold_right add_set vs [[]]))

(* is the set term true in the set valuation
   -- reduces to propositional reasoning *)
let istrue (st:setTerm) (id,ivaluation) : bool =
  let valuation = match ivaluation with
    | Inter v -> v
    | _ -> failwith "wrong valuation" in
  let lookup v =
    if List.mem (Setvar v) valuation then true
    else if List.mem (Complement (Setvar v)) valuation then false
    else failwith "istrue: unbound var in valuation" in
  let rec check st = match st with
    | Setvar v -> lookup v
    | Emptyset -> false
    | Fullset -> true
    | Complement st1 -> not (check st1)
    | Union sts -> List.fold_right (fun st1 t -> check st1 || t) sts false
    | Inter sts -> List.fold_right (fun st1 t -> check st1 && t) sts true
  in check st

(* compute cardinality of set expression
   as a sum of cardinalities of cubes *)
let get_sum (p:partition) (st:setTerm) : intTerm list =
  let get_list (id,inter) = match inter with
    | Inter ss -> ss
    | _ -> failwith "failed inv in get_sum"
  in
  List.map (fun (id,inter) -> Intvar id)
    (List.filter (istrue st) p)

(* replace cardinalities of sets with sums of
   variables denoting cube cardinalities *)
let replace_cards (p:partition) (f:form) : form =
  let rec repl f = match f with
    | Not f -> Not (repl f)
    | And fs -> And (List.map repl fs)
    | Or fs -> Or (List.map repl fs)
    | Impl(f1,f2) -> Impl(repl f1,repl f2)
    | Bind(b,id,f1) -> Bind(b,id,repl f1)
    | Less(it1,it2) -> Less(irepl it1,irepl it2)
    | Inteq(it1,it2) -> Inteq(irepl it1,irepl it2)
    | Seteq(_,_)|Subseteq(_,_) -> failwith "failed inv in replace_cards"
  and irepl it = match it with
    | Intvar _ -> it
    | Const _ -> it
    | Plus its -> Plus (List.map irepl its)
    | Minus(it1,it2) -> Minus(irepl it1,irepl it2)
    | Times(k,it1) -> Times(k,irepl it1)
    | Card st -> Plus (get_sum p st)
  in repl f

(* wrap f in the given quantifier sequence, first element outermost *)
let apply_quants quants f =
  List.fold_right (fun (b,id) f -> Bind(b,id,f)) quants f

(* pair up adjacent cubes that differ only in id; each result is
   (merged cube, name of its variable, name for id, name for complement of id) *)
let make_defining_eqns id part =
  let rec mk ps = match ps with
    | [] -> []
    | (id1,Inter (st1::sts1)) :: (id2,Inter (st2::sts2)) :: ps1
      when (st1=Setvar id && st2=Complement (Setvar id) && sts1=sts2) ->
        (Inter sts1, make_name sts1, id1, id2) :: mk ps1
    | _ -> failwith "make_triples: unexpected partition form" in
  let rename_last lss = match lss with
    | [(s,l,l1,l2)] -> [(s,maxcard,l1,l2)]
    | _ -> lss in
  rename_last (mk part)

(* -------------------------------------------------- *)
(* main loop of the algorithm *)

let rec eliminate_all quants part gr = match quants with
  | [] -> gr
  | (Existsint,id)::quants1 ->
      eliminate_all quants1 part (Bind(Existsint,id,gr))
  | (Forallint,id)::quants1 ->
      eliminate_all quants1 part (Bind(Forallint,id,gr))
  | (Existsnat,id)::quants1 ->
      eliminate_all quants1 part (Bind(Existsnat,id,gr))
  | (Forallnat,id)::quants1 ->
      eliminate_all quants1 part (Bind(Forallnat,id,gr))
  | (Existsset,id)::quants1 ->
      let eqns = make_defining_eqns id part in
      let newpart = List.map (fun (s,l',_,_) -> (l',s)) eqns in
      let mk_conj (_,l',l1,l2) = Inteq(Intvar l',Plus [Intvar l1;Intvar l2]) in
      let conjs = List.map mk_conj eqns in
      let lquants = List.map (fun (l,_) -> (Existsnat,l)) part in
      let gr1 = apply_quants lquants (And (conjs @ [gr])) in
      eliminate_all quants1 newpart gr1
  | (Forallset,id)::quants1 ->
      let eqns = make_defining_eqns id part in
      let newpart = List.map (fun (s,l',_,_) -> (l',s)) eqns in
      let mk_conj (_,l',l1,l2) = Inteq(Intvar l',Plus [Intvar l1;Intvar l2]) in
      let conjs = List.map mk_conj eqns in
      let lquants = List.map (fun (l,_) -> (Forallnat,l)) part in
      let gr1 = apply_quants lquants (Impl(And conjs, gr)) in
      eliminate_all quants1 newpart gr1

(* putting everything together *)
let alpha (f:form) : form =
  (* assumes f in prenex form *)
  let (quants,fm) = split_quants_body f in
  let fm1 = simplify_set_relations fm in
  let setvars = List.rev (extract_set_vars quants) in
  let part = generate_partition setvars in
  let g1 = replace_cards part fm1 in
  eliminate_all quants part g1